https://www.cs.berkeley.edu/~jordan/courses/294-fall09/lectures/feature/slides.pdf
http://www.cs.princeton.edu/courses/archive/spring10/cos424/slides/18-feat.pdf
http://www.cs.princeton.edu/courses/archive/spring10/cos424/w/rinfo
https://www.kaggle.com/c/springleaf-marketing-response/forums/t/16676/good-examples-of-feature-engineering
http://www.cs.stanford.edu/people/chrismre/papers/mythical_man.pdf
http://datascience.stackexchange.com/questions/8286/are-there-any-tools-for-feature-engineering
http://pslcdatashop.org/KDDCup/workshop/papers/kdd2010ntu.pdf
http://dspace.mit.edu/handle/1721.1/90409
http://homes.cs.washington.edu/~pedrod/
https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
At the end of the day, some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used. If you have many independent features that each correlate well with the class, learning is easy. On the other hand, if the class is a very complex function of the features, you may not be able to learn it.
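A small sketch of the point in the quote above, under toy assumptions that do not come from the paper: the four data points, the XOR-like labels, and the helper `pearson` are all made up for illustration. Each raw feature has zero correlation with the class, while one engineered feature (their product) correlates with it perfectly.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy data (an assumption for illustration): two features taking values in {-1, +1}.
points = list(product([-1, 1], repeat=2))

# XOR-like class: +1 when the two features agree, -1 when they differ.
labels = [a * b for a, b in points]

x1 = [a for a, _ in points]
x2 = [b for _, b in points]
x1_times_x2 = [a * b for a, b in points]  # engineered feature

corr_x1 = pearson(x1, labels)
corr_x2 = pearson(x2, labels)
corr_product = pearson(x1_times_x2, labels)

print("corr(x1, class)    =", corr_x1)
print("corr(x2, class)    =", corr_x2)
print("corr(x1*x2, class) =", corr_product)
```

Here the class is a non-additive function of the raw features, so neither one helps on its own; after the transformation a single feature determines the class.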
Sean Gerrish, PhD in ML, Tech lead of stats-y machine-learning-y groups at Google
Sometimes people using a machine learning algorithm implement it themselves. This is a project that might take an hour or a week of someone's time (or more, depending on the algorithm and the skill of the person implementing it). An "off the shelf" algorithm is one that has been implemented by someone else and is available in a library. Usually this means it has been implemented in a fairly generic way, and usually there will be some room for improvement (e.g., by selecting or transforming features).
@article{Domingos:2012:FUT:2347736.2347755,
  author = {Domingos, Pedro},
  title = {A Few Useful Things to Know About Machine Learning},
  journal = {Commun. ACM},
  issue_date = {October 2012},
  volume = {55},
  number = {10},
  month = oct,
  year = {2012},
  issn = {0001-0782},
  pages = {78--87},
  numpages = {10},
  url = {http://doi.acm.org/10.1145/2347736.2347755},
  doi = {10.1145/2347736.2347755},
  acmid = {2347755},
  publisher = {ACM},
  address = {New York, NY, USA},
}