The Research Behind Flash Fill

I predict that one of the most controversial new features in Excel 2013 will be Flash Fill. Here's a video that demonstrates how it works. It's an AI-driven alternative to using formulas to extract specific parts of text strings. For example, if you have a column of names, you can type a few last names, and Flash Fill will figure out what you're doing and fill in the empty cells.

It's a great concept, but it can also lead to lots of bad data. I think many users will look at a few "flash filled" cells and just assume that it worked. But my preliminary tests lead me to this conclusion: Be very careful.

Flash Fill works in two ways: (1) it can extract data by example, and (2) it can create data by example. Creating data seems to be much more reliable. But use caution when extracting data. Most of the extracted data will be fine, but there might be exceptions that you won't notice unless you examine the results very carefully.
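To see why extraction by example can quietly go wrong, here's a minimal Python sketch. It is not Flash Fill's actual algorithm, just an illustration of the general problem: the rule functions, sample names, and helper names below are all made up for this example. Two different rules ("take the second word" and "take the last word") both agree with the examples a user might type, yet they disagree on a row the user never looked at.

```python
# Illustrative only: hypothetical rules, not how Flash Fill is implemented.

def second_word(name):
    """Candidate rule 1: return the second whitespace-separated word."""
    parts = name.split()
    return parts[1] if len(parts) > 1 else ""

def last_word(name):
    """Candidate rule 2: return the last whitespace-separated word."""
    parts = name.split()
    return parts[-1] if parts else ""

def consistent(rule, examples):
    """True if the rule reproduces every (input, expected output) example."""
    return all(rule(src) == expected for src, expected in examples)

def ambiguous_rows(rows, rules):
    """Return the rows on which the candidate rules give different outputs."""
    return [row for row in rows if len({rule(row) for rule in rules}) > 1]

# The user types last names for the first two rows only.
examples = [("John Smith", "Smith"), ("Ann Lee", "Lee")]

# The full column, including a name with three parts.
names = ["John Smith", "Ann Lee", "Mary Ann Jones", "Bob Stone"]

# Both rules fit the examples, so the examples alone cannot pick between them.
candidates = [r for r in (second_word, last_word) if consistent(r, examples)]

# Rows where the surviving rules disagree are the ones worth checking by hand.
suspect = ambiguous_rows(names, candidates)
print(suspect)
```

In this sketch the only row flagged is the three-part name, which is exactly the kind of exception that is easy to miss when you glance at the first few filled cells.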

For the technically minded, here's a Microsoft Research report (PDF) that seems to be the basis for this feature: Automating String Processing in Spreadsheets Using Input-Output Examples.

