What is Machine Learning? Why Machine Learning?
Compiled by Aaron | Source: Commonlounge
Motivation behind Machine Learning
Sometimes we encounter problems for which it's really hard to write a computer program to solve. For example, let's say we wanted to program a computer to recognize hand-written digits.
You could imagine trying to devise a set of rules to distinguish each individual digit. Zeros, for instance, are basically one closed loop. But what if the person didn't perfectly close the loop? Or what if the right top of the loop closes below where the left top of the loop starts?
In this case, we have difficulty differentiating zeroes from sixes. We could establish some sort of cutoff, but how would you decide the cutoff in the first place? As you can see, it quickly becomes quite complicated to compile a list of heuristics (i.e., rules and guesses) that accurately classifies handwritten digits.
And there are so many more classes of problems that fall into this category. Recognizing objects, understanding concepts, comprehending speech. We don't know what program to write because we still don't know how it's done by our own brains. And even if we did have a good idea about how to do it, the program might be horrendously complicated.
So instead of trying to write a program, we try to develop an algorithm that a computer can use to look at hundreds or thousands of examples (and the correct answers), and then the computer uses that experience to solve the same problem in new situations. Essentially, our goal is to teach the computer to solve problems by example, very similar to how we might teach a young child to distinguish a cat from a dog.
What is Machine Learning? - Definition
The field itself: ML is a field of study which harnesses principles of computer science and statistics to create statistical models. These models are generally used to do two things:
Prediction: make predictions about the future based on data about the past
Inference: discover patterns in data
Difference between ML and AI: There is no universally agreed-upon distinction between ML and artificial intelligence (AI). AI usually concentrates on programming computers to make decisions (based on ML models and sets of logical rules), whereas ML focuses more on making predictions about the future. They are highly interconnected fields, and, for most non-technical purposes, they are the same.
What's a statistical model?
Models: Teaching a computer to make predictions involves feeding data into machine learning models, which are representations of how the world supposedly works. If I tell a statistical model that the world works a certain way (say, for example, that taller people make more money than shorter people), then the model can tell me who it thinks will make more money: Cathy, who is 5'2'', or Jill, who is 5'9''.
What does a model actually look like? Surely the concept of a model makes sense in the abstract, but knowing this is just half the battle. You should also know how it's represented inside of a computer, or what it would look like if you wrote it down on paper.
A model is just a mathematical function, which, as you probably already know, is a relationship between a set of inputs and a set of outputs. Here's an example:
f(x) = x^2
This is a function that takes as input a number and returns that number squared. So, f(1) = 1, f(2) = 4, f(3) = 9.
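As a small illustration, the same function could be written in Python as follows. This is only a sketch; the name f simply mirrors the formula above.

```python
def f(x):
    """Return the input number squared."""
    return x ** 2


# Evaluate the function at the three inputs used in the text.
for x in (1, 2, 3):
    print(f"f({x}) = {f(x)}")
```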
Let's briefly return to the example of the model that predicts income from height. I may believe, based on what I've seen in the corporate world, that a given human's annual income is, on average, equal to her height (in inches) times 1,000. So, if you're 60 inches tall (5 feet), then I'll guess that you probably make $60,000 a year. If you're a foot taller, I think you'll make $72,000 a year.
This model can be represented mathematically as follows:
Income = Height × $1,000
In other words, income is a function of height.
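To make this concrete, here is a minimal sketch of the income model as a Python function. The names predict_income and DOLLARS_PER_INCH are made up for this illustration, and the rate of $1,000 per inch is just the guess from the text, not a real-world figure.

```python
# The hand-picked rate from the text: $1,000 of annual income per inch of height.
DOLLARS_PER_INCH = 1_000


def predict_income(height_inches):
    """Predict annual income in dollars from height in inches."""
    return height_inches * DOLLARS_PER_INCH


# The two cases from the text: 5 feet (60 inches) and a foot taller (72 inches).
print(predict_income(60))
print(predict_income(72))
```

Notice that nothing has been learnt here yet: the rate was chosen by hand.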
Here's the main point: Machine Learning refers to a set of techniques for estimating functions (like the one involving income) based on datasets (pairs of heights and their associated incomes). These functions, which are called models, can then be used for predictions of future data.
Algorithms: These functions are estimated using algorithms. In this context, an algorithm is a predefined set of steps that takes as input a bunch of data and then transforms it through mathematical operations. You can think of an algorithm like a recipe — first do this, then do that, then do this. Done.
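As a rough sketch of what such a recipe might look like, the code below estimates the rate in Income = Height × rate from a handful of examples. It uses least squares for a line through the origin, which is one common choice and not something the text prescribes. The function name fit_rate and the data points are invented purely for illustration.

```python
def fit_rate(heights, incomes):
    """Estimate `rate` in income = rate * height by least squares.

    For a line through the origin, the best-fitting rate is
    sum(height * income) / sum(height ** 2).
    """
    numerator = sum(h * y for h, y in zip(heights, incomes))
    denominator = sum(h * h for h in heights)
    return numerator / denominator


# Made-up training data: heights in inches and annual incomes in dollars.
heights = [60, 64, 68, 72]
incomes = [61_000, 63_000, 69_000, 71_000]

rate = fit_rate(heights, incomes)
print(f"Estimated rate: about ${rate:,.0f} per inch")

# Use the estimated function to predict for a height not in the data.
print(f"Predicted income at 66 inches: about ${rate * 66:,.0f}")
```

The steps are fixed in advance (multiply, sum, divide), but the number that comes out depends entirely on the data fed in. That number is what the model has learnt.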
Machine learning of all types uses models and algorithms as its building blocks to make predictions and inferences about the world.
What exactly is being learnt?
To explain what is being learnt in machine learning, let's start with an example application: spam classification. One way to write a computer program that separates spam emails from non-spam emails is to split each email into individual words and maintain a list of words that appear more frequently in spam emails. Examples of such words might be 'loan', '$', 'credit', 'discount', 'offer', 'password', 'viagra', and so on. Then, if an email contains a substantial number of these words, it is classified as spam.
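A minimal sketch of this hand-written approach might look like the following. The word list comes from the examples above. The threshold of 2 and the very simple tokenization are assumptions made for this illustration only.

```python
# Hand-maintained list of suspicious words (taken from the examples above).
SPAM_WORDS = {"loan", "$", "credit", "discount", "offer", "password", "viagra"}

# Hand-picked cutoff (an assumption): this many suspicious words means spam.
THRESHOLD = 2


def count_spam_words(email_text):
    """Count how many words of the email appear in the spam word list."""
    words = email_text.lower().split()
    # Strip trailing punctuation so that "discount!" still matches "discount".
    return sum(1 for word in words if word.strip(".,!?:;") in SPAM_WORDS)


def is_spam(email_text):
    """Classify an email as spam if it contains enough suspicious words."""
    return count_spam_words(email_text) >= THRESHOLD


print(is_spam("Special offer: get a loan at a discount!"))
print(is_spam("Lunch at noon tomorrow?"))
```

Both the word list and the threshold are chosen by a person here, which is exactly the part machine learning aims to replace.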
Although the strategy above might give fairly good results (say detect spam with an accuracy of 80%), the accuracy depends in large part on the list of words we maintain, and on the precise threshold we choose to classify an email as spam.
In machine learning, the strategy is to learn the list of words and the threshold from examples. In fact, in addition to which words are bad words, we could also learn how bad each word is. (This example is quite realistic, and is how many spam classification algorithms work.)
So in this case, the thing being learnt is a notion of how bad each word is. Note that this is not the only way to frame the problem. We framed it this way because we noticed a pattern (spam emails often contain specific words) and then came up with a strategy that treats every word as a possible suspect. For other tasks, this strategy might give inaccurate results or be too inefficient.
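Here is one simple way this learning could be sketched in code. It is not the method of any particular spam filter: the scoring rule (a smoothed log-ratio of how often a word shows up in spam versus non-spam emails, loosely in the spirit of naive Bayes), the function names, and the six training emails are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text, split on whitespace, and strip simple punctuation."""
    words = (w.strip(".,!?:;") for w in text.lower().split())
    return [w for w in words if w]


def learn_word_scores(emails, labels):
    """Learn how 'bad' each word is from labelled examples.

    A positive score means the word is more typical of spam,
    a negative score means it is more typical of non-spam.
    """
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = sum(labels)
    n_ham = len(labels) - n_spam
    for text, is_spam in zip(emails, labels):
        counts = spam_counts if is_spam else ham_counts
        counts.update(set(tokenize(text)))  # count each word once per email
    vocabulary = set(spam_counts) | set(ham_counts)
    # The +1 / +2 smoothing avoids taking the log of zero for unseen words.
    return {
        word: math.log((spam_counts[word] + 1) / (n_spam + 2))
        - math.log((ham_counts[word] + 1) / (n_ham + 2))
        for word in vocabulary
    }


def email_score(text, scores):
    """Add up the learnt scores of the distinct words in an email."""
    return sum(scores.get(word, 0.0) for word in set(tokenize(text)))


def learn_threshold(emails, labels, scores):
    """Pick the cutoff that classifies the most training emails correctly."""
    candidates = sorted(email_score(text, scores) for text in emails)

    def correct(threshold):
        return sum(
            (email_score(text, scores) >= threshold) == bool(label)
            for text, label in zip(emails, labels)
        )

    return max(candidates, key=correct)


def classify(text, scores, threshold):
    """Return True if the email's total score reaches the learnt threshold."""
    return email_score(text, scores) >= threshold


# Made-up training examples: True means spam, False means not spam.
emails = [
    "Cheap loan offer, claim your discount now",
    "Verify your password to claim this credit offer",
    "Discount viagra, limited offer",
    "Are we still meeting for lunch tomorrow?",
    "Here are the notes from the meeting",
    "Can you review my report before tomorrow?",
]
labels = [True, True, True, False, False, False]

scores = learn_word_scores(emails, labels)
threshold = learn_threshold(emails, labels, scores)

print(classify("Claim your loan offer", scores, threshold))
print(classify("Lunch meeting tomorrow?", scores, threshold))
```

Neither the word scores nor the threshold appear anywhere in the code as hand-written values. Both are computed from the examples, so feeding in different examples yields a different classifier.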
Desirable properties of machine learning
You might notice that using machine learning to learn how bad each word is has many desirable properties over maintaining this list manually.
It reduces the amount of manual work involved in creating the list. Think about how long this list could get if you try to do this manually. Also, if you're trying to maintain the list manually, how would you deal with hundreds of languages across the world? This task can easily become infeasible without machine learning.
The same strategy works for other similar tasks. Say we wanted to classify whether a movie review speaks positively or negatively about a movie. If we were creating lists of words manually, we would have to build a new list from scratch. But if we learn the list, the same algorithm works as long as we have some data (say, ratings and reviews left by users on IMDb).
It updates automatically. Let's say tomorrow the spammers become more advanced and start disguising the word 'password' with an altered spelling (for example, 'passw0rd'). Or they might try to sell you insurance, something we haven't yet encountered. We can simply set the machine learning algorithm to be retrained daily, and it will use the newly available data and keep adapting to changing behavior over time.