Thursday, July 24, 2014

Should move-only types ever be passed by value?

[For this post, I'm going to pretend that std::unique_ptr is a type, instead of a template, because the issue being examined is independent of what a std::unique_ptr points to.]

Suppose I want to pass a std::unique_ptr to a constructor, where the std::unique_ptr will be moved into a data member. The std::unique_ptr parameter thus acts as a sink. To the extent that we have enough experience with C++11 for wisdom about it to be conventional, said wisdom seems to be that the std::unique_ptr should be passed by value. In his GotW 91 solution, Herb Sutter argues for it. The High Integrity Coding Standard has it as a guideline. (It cites Herb's article as the source.) In his C++ Reference Guide, Danny Kalev argues for it. Many StackOverflow answers repeat this advice.

But recently Matthew Fioravante brought a StackOverflow question to my attention showing a problem resulting from declaring a std::unique_ptr by-value sink parameter, and later Matthew suggested that sink parameters of move-only types should be passed by rvalue reference. This is a very interesting idea.

Suppose you see this function signature:
void f(SomeType&& param);
What does this tell you about param? The fact that it's an rvalue reference tells you that it's a candidate to be moved from, and the usual expectation is that it will be. In other words, it's a sink parameter. Note that this is completely independent of param's type. Even without knowing anything about SomeType, we can conclude that param is a sink parameter.

If SomeType happens to be std::unique_ptr, nothing changes: param is still a sink. There's no need for a special rule for std::unique_ptrs that tells us to pass them by value to indicate that they're sinks, because we already have a way to unambiguously say that: pass them by rvalue reference.
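To make the "rvalue reference means sink" reading concrete, here is a minimal sketch. It assumes std::unique_ptr<int> as the concrete type (the post deliberately leaves the template argument out), and Holder, take, and sinkTakesOwnership are hypothetical names chosen only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical sink class; Holder and take are illustrative names.
class Holder {
public:
    // The rvalue reference parameter signals that param is expected to be moved from.
    void take(std::unique_ptr<int>&& param) { held = std::move(param); }
    const int* get() const { return held.get(); }

private:
    std::unique_ptr<int> held;
};

// Returns true if ownership ended up in the Holder and the caller's
// std::unique_ptr was left null.
bool sinkTakesOwnership() {
    Holder h;
    std::unique_ptr<int> up(new int(42));
    assert(up != nullptr);     // the caller owns the int before the call

    // h.take(up);             // would not compile: an lvalue can't bind to an rvalue reference
    h.take(std::move(up));     // the caller explicitly gives up ownership

    return up == nullptr && h.get() != nullptr && *h.get() == 42;
}
```

The commented-out call is the point: the signature alone forces callers to write std::move, so the transfer of ownership is visible at the call site.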

Going back to the idea of a constructor moving a std::unique_ptr into a data member, this is what the code looks like using pass by value:
class Widget {
public:
  explicit Widget(std::unique_ptr ptr): p(std::move(ptr)) {}

private:
  std::unique_ptr p;
};
Now consider this calling code:
std::unique_ptr up;

Widget(std::move(up));
What's the cost of getting up into p? Well, the parameter ptr has to be constructed, and the data member p does, too. Each costs a move construction, so the total cost is two move constructions (modulo optimizations).

Now consider the same thing using pass by rvalue reference:

class Widget {
public:
  explicit Widget(std::unique_ptr&& ptr): p(std::move(ptr)) {}

private:
  std::unique_ptr p;
};

std::unique_ptr up;

Widget(std::move(up));
Here, only the data member p will be constructed, so the total cost is only one move construction.
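The two-versus-one count can be checked with a small experiment. The sketch below swaps std::unique_ptr for a hypothetical move-only type, Tracked, that counts its move constructions; ByValueWidget, ByRvalueRefWidget, and movesToConstruct are likewise made-up names, not anything from the standard library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical move-only type that counts how often it is move-constructed.
struct Tracked {
    static int& moves() { static int n = 0; return n; }

    Tracked() = default;
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
    Tracked(Tracked&&) noexcept { ++moves(); }
    Tracked& operator=(Tracked&&) = delete;
};

// Sink parameter passed by value.
struct ByValueWidget {
    explicit ByValueWidget(Tracked t) : member(std::move(t)) {}
    Tracked member;
};

// Sink parameter passed by rvalue reference.
struct ByRvalueRefWidget {
    explicit ByRvalueRefWidget(Tracked&& t) : member(std::move(t)) {}
    Tracked member;
};

// Returns the number of move constructions needed to get a named object
// (passed via std::move) into WidgetType's data member.
template <typename WidgetType>
int movesToConstruct() {
    Tracked source;
    Tracked::moves() = 0;
    WidgetType w(std::move(source));
    (void)w;
    return Tracked::moves();
}

// Usage sketch: call from main().
void demoMoveCounts() {
    assert(movesToConstruct<ByValueWidget>() == 2);      // parameter + data member
    assert(movesToConstruct<ByRvalueRefWidget>() == 1);  // data member only
}
```

Because the argument here is std::move of a named object, the compiler is not allowed to elide the move into the by-value parameter. If the argument were a temporary, elision could remove that move, which is the kind of optimization the "modulo optimizations" caveat covers.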

Unless I'm overlooking something, passing sink parameters of type std::unique_ptr by value is inconsistent with our usual idiom for expressing the idea of a sink parameter (i.e., to pass by rvalue reference), and it's less efficient, too. My sense is that the conventional wisdom regarding sink parameters of type std::unique_ptr is all messed up.

Which leads to the question: how did it get messed up? I believe what happened was that people noticed that for maximal efficiency when passing lvalues and rvalues of a particular type that needed to be copied inside the function, you needed either to overload on lvalue references and rvalue references or to pass by universal reference. Both approaches have problems (overloading doesn't scale to multiple parameters, because n parameters call for 2^n overloads, and universal references suffer from the shortcomings of perfect forwarding, lousy error messages, and sometimes being too greedy). For cheap-to-move types, people found, you can use pass by value with only a modest efficiency loss, and the conventional wisdom, in large part based on David Abrahams' blog post, "Want Speed? Pass by Value," came to embrace that idea.
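For a copyable type, the three alternatives look roughly like this. This is only a sketch: it uses std::string as a stand-in for a copyable, cheap-to-move type, and OverloadedSetter, ForwardingSetter, ByValueSetter, and acceptsLvaluesAndRvalues are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// 1. Overload on lvalue reference and rvalue reference.
struct OverloadedSetter {
    std::string name;
    void setName(const std::string& n) { name = n; }             // lvalues: copy
    void setName(std::string&& n)      { name = std::move(n); }  // rvalues: move
};

// 2. Universal reference with perfect forwarding.
struct ForwardingSetter {
    std::string name;
    template <typename T>
    void setName(T&& n) { name = std::forward<T>(n); }           // copy or move, as passed
};

// 3. Pass by value, then move: one function, at the cost of an extra move.
struct ByValueSetter {
    std::string name;
    void setName(std::string n) { name = std::move(n); }
};

// Returns true if Setter accepts both an lvalue (leaving it intact) and an rvalue.
template <typename Setter>
bool acceptsLvaluesAndRvalues() {
    Setter s;
    std::string lv("lvalue");
    s.setName(lv);                                  // lvalue argument: lv must survive
    bool lvalueOk = (s.name == "lvalue" && lv == "lvalue");
    s.setName(std::string("rvalue"));               // rvalue argument
    return lvalueOk && s.name == "rvalue";
}

// Usage sketch: call from main().
void demoThreeApproaches() {
    assert(acceptsLvaluesAndRvalues<OverloadedSetter>());
    assert(acceptsLvaluesAndRvalues<ForwardingSetter>());
    assert(acceptsLvaluesAndRvalues<ByValueSetter>());
}
```

All three accept both kinds of argument. They differ in how many functions have to be written and in how many copies or moves each call performs.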

The thing is, for move-only types like std::unique_ptr, you don't need to worry about dealing with lvalues, because lvalues get copied, and move-only types aren't copyable. So there's no need to overload for lvalues and rvalues, hence no scalability problem for multiple parameters. Which means that the motivations for replacing pass by reference--which is what the conventional wisdom from C++98 always dictated--with pass by value don't exist for move-only types.
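The claim that lvalues drop out of the picture can be stated as compile-time checks. In this sketch, std::unique_ptr<int> is an assumed concrete type, and IntPtr, owns, and singleOverloadSuffices are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

using IntPtr = std::unique_ptr<int>;  // assumed concrete type for this sketch

static_assert(!std::is_copy_constructible<IntPtr>::value,
              "std::unique_ptr can't be copied");
static_assert(std::is_move_constructible<IntPtr>::value,
              "std::unique_ptr can be moved");

class Widget {
public:
    // A single rvalue-reference constructor; no lvalue overload is needed.
    explicit Widget(IntPtr&& ptr) : p(std::move(ptr)) {}
    bool owns(int v) const { return p && *p == v; }

private:
    IntPtr p;
};

static_assert(!std::is_constructible<Widget, IntPtr&>::value,
              "lvalue arguments are rejected at compile time");
static_assert(std::is_constructible<Widget, IntPtr&&>::value,
              "rvalue arguments are accepted");

// Returns true if the one constructor handles both moved-from names and temporaries.
bool singleOverloadSuffices() {
    IntPtr up(new int(7));
    Widget fromMoved(std::move(up));           // named object, explicitly moved
    Widget fromTemporary{IntPtr(new int(8))};  // temporary
    return fromMoved.owns(7) && fromTemporary.owns(8);
}
```

With no lvalue case to handle, each sink parameter needs only one signature, so the overload count stays flat however many parameters there are.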

My feeling is that Matthew Fioravante may well have hit the nail on the head here: there is no reason to use by-value parameters to express "sinkness" for move-only types. Instead, the usual rule of passing sink parameters by rvalue reference should apply.

The special case of considering pass by value for always-copied parameters really applies only to types that are both copyable and movable, and only in situations where neither overloading nor the use of a universal reference is desired.

What do you think? Is there ever a time where move-only types should be passed by value?

Scott

Saturday, July 19, 2014

Free Excerpt from Draft EMC++ Now Available

O'Reilly has made the TOC, Introduction, and first chapter ("Deducing Types") from the draft version of Effective Modern C++ available for free download. That's roughly the first 40 pages of the book, including Items 1-4. The PDF is available here.

I hope you enjoy this sample content, but I can't resist reminding you that this is from a draft manuscript. The final version will be better.

As always, I welcome suggestions for how this material can be improved.

Scott

Friday, July 18, 2014

Is the non-pointer syntax for declaring function pointer parameters not worth mentioning?

In my view, one of the most important favors a technical writer can do for his or her readers is shield them from information they don't need to know. One of my standard criticisms of authors is "S/he knows a lot. S/he wrote it all down." (Yes, I know: the "his or her" and "s/he" stuff is an abomination. Please suffer in silence on that. There's a different fish I want to fry in this post.)

I try not to commit the sin of conveying unimportant information, but one of the hazards of spending decades in this business is that you learn a lot. After a while, it can be hard to evaluate what's worth knowing and what's best left unsaid. With that in mind, the current draft of Effective Modern C++ contains this passage (more or less):
Suppose our function f can have its behavior customized by passing it a function that does some of its work. Assuming this function takes and returns ints, f could be declared like this:
     void f(int (*pf)(int));         // pf = "processing function"
It’s worth noting that f could also be declared using a simpler non-pointer syntax. Such a declaration would look like this, though it’d have the same meaning as the declaration above:
     void f(int pf(int));            // declares same f as above

One of my reviewers argues that the second way of declaring a function pointer parameter is not only not worth noting, it's not worth knowing. That means I should omit it. But I find myself reluctant to do that. I like the fact that function pointer parameters can be declared without throwing in additional parentheses and figuring out which side of the asterisk the parameter name has to go on. But maybe that's just me.
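For readers who want to convince themselves that the two spellings declare the same thing, here is a small sketch. The names f1, f2, applyTo, and twice are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

void f1(int (*pf)(int));   // pointer syntax
void f2(int pf(int));      // non-pointer syntax

// A parameter of function type is adjusted to pointer-to-function,
// so both declarations have the same function type.
static_assert(std::is_same<decltype(f1), decltype(f2)>::value,
              "both spellings yield the same function type");

// The same function can even be declared one way and defined the other.
int applyTo(int (*pf)(int), int value);                      // declaration
int applyTo(int pf(int), int value) { return pf(value); }   // definition of the same function

int twice(int x) { return 2 * x; }

// Usage sketch: call from main().
void demoFunctionPointerSyntax() {
    assert(applyTo(twice, 21) == 42);    // function name decays to a pointer
    assert(applyTo(&twice, 21) == 42);   // explicit address-of works too
}
```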

So what do you think? Should I jettison the aside about the asterisk-free way to declare the parameter pf, or should I keep it?

Thanks,

Scott

Friday, July 11, 2014

Draft Version of Effective Modern C++ Now Available

A full draft of Effective Modern C++ is now available through O'Reilly's Early Release program and Safari Books Online's Rough Cuts program.

The cover design for the book has changed. I'm the common crane no longer. Now I'm the fruit dove. I'm not sure what to make of this. Is going from common to fruity an upgrade?

My editor explained the reason for the avian switcheroo:
Since we are going full color for the book, the animal on the cover is going to be full color. The common crane isn't super colorful so we are switching to the fruit dove because it is filled with color.
During the early release it will be in black and white and the color will make its appearance soon.
The big news here is that O'Reilly has decided to print the book in full color. That's a first for me, and it means that the print book will display the different colors I originally put into the manuscript with only digital publishing in mind.

Well, the print book might display those colors. I'm not preparing the formatted content for this book. Instead, I'm working on a manuscript into which I put all the logical information that O'Reilly will need to produce the appropriate formatting for the files that will ultimately be used for publication and distribution. These will include PDF, ePub, and Kindle-compatible .mobi. So text in my manuscript that's red or blue or tan may not show up in those colors in the final product.

However, those colors will show up in the Early Release version of the book, because that document is simply a PDF print-out of the manuscript I'm working on. Please don't assume that the Early Release document reflects what the final book will look like. Even if O'Reilly chooses to use the colors I selected, they'll definitely clean up many aspects of the page layout that I designed to be seen only by me and my technical reviewers.

The Early Release book reflects the state of the manuscript as of about six weeks ago. Since then, I've read through the several hundred comments I got from my reviewers. Of the 315 pages in the manuscript, only 91 survived unscathed. That means that over 70% of the pages in the Early Release version have problems (that I know about). Some issues are small, e.g., misspellings or grammatical errors. Others involve important technical shortcomings that I'll have to think hard about how to address. To date, my favorite error report concerns a page in the manuscript where I managed to mis-translate a simple C++14 lambda expression into a corresponding C++11 lambda or call to std::bind in three different ways. I got it wrong. Three times. In three different ways. On one page. 

The draft book is available either as a standalone purchase or through Safari Books Online. At Safari, the book is currently associated with yet a third cover design (a hummingbird), but I believe we're going with the dove. Yet maybe not. I'm less in the loop on this stuff than you might think an author would be.

If you read the draft Items I posted earlier (here and here), you may be disappointed or even alarmed to see that the Items in the Early Release draft are essentially the same as what I posted. Relax. In the past few months, I've been focusing on finishing a full draft of the book, so work on revising existing Items got pushed to the back burner. I'm tending to those burners now, so the final versions of the Items will include changes in my thinking based on the comments I've received. 

I'm a little nervous about making this draft available. My normal modus operandi is to make things as good as I can before I publish them. What you're seeing in this case is a draft that was as good as I could make it before I sent it out for technical review. Based on the comments of my reviewers, I'd say it's not in bad shape, but there are a lot of places (some 70% of them) that need work. If you see opportunities for improvement, feel free to let me know about them, either by using the O'Reilly or Safari online errata reporting mechanisms or by sending me email (smeyers@aristeia.com). I don't expect to have time to respond to or to even acknowledge every report that comes in, but I assure you that I'll read them all and do my best to incorporate them into the final version of the book.

Scott