Facebook just wrapped up its F8 developer conference, the company's biggest event of the year. And, in between talk of bots for Messenger, 360-degree video and virtual reality selfie sticks, one topic that was never far from the discussion was also one that we haven't heard Facebook discuss very openly in the past: artificial intelligence.
Throughout the two-day event, artificial intelligence came up again and again, and on Wednesday, Joaquin Quiñonero Candela, Facebook's director of applied machine learning -- the group tasked with AI research -- took the stage to discuss how the company is tackling some of its biggest challenges.
Mashable sat down with Candela after the keynote to hear more about how Facebook's AI affects News Feed, ads and photos, and how it may -- one day -- also influence live video. The conversation has been lightly edited for length and clarity.
We heard a lot today about where Facebook is investing in AI, like computer vision, for example, but at a more practical level right now, how does the typical Facebook user experience AI?
It's everywhere. Facebook today could not exist without AI, and the reason for that starts with the fundamentals, like News Feed. If we were to not rank your News Feed at all -- and we've done a lot of research on how much people would engage -- the experience would degrade so much that it's likely a lot of people would just use Facebook a lot less than they do today. So it's the fundamentals.
In ads, we have many millions of ads that are active on Facebook... it's definitely more than 2.5 million active advertisers. If we picked ads at random, you would hate ads. Believe me on that. These are two obvious examples where AI is essential.
Another obvious area is not in what you see but in what you don't see. With well over a billion images uploaded every day to Facebook, not all of them are fine. We have a lot of images that are offensive and objectionable. We've been working very hard the last couple of years to go from a mode where we would rely on people to report offensive images, which would then get queued up for human raters, to a mode where we have a computer vision system that takes down those images before anyone sees them. And the reason that's a big deal is that in the previous way we did things, the damage would already be done. There would be thousands of people who would have seen the offensive content. Now we can just take it down before anyone sees it.
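What Candela describes amounts to moving moderation from a reactive, report-driven queue to a proactive gate at upload time. A minimal sketch of that flow might look like the following; the model stub, thresholds and function names are hypothetical illustrations, not Facebook's actual system:

```python
# A sketch of shifting from "publish, wait for reports, queue for human
# raters" to "score at upload time, take it down before anyone sees it".
# All names and thresholds here are invented for illustration.

BLOCK_THRESHOLD = 0.95    # confident enough to remove automatically
REVIEW_THRESHOLD = 0.60   # uncertain cases still go to human raters

def objectionable_score(image_bytes: bytes) -> float:
    """Stand-in for a vision model returning P(image is objectionable)."""
    return 0.0  # a real system would run a trained classifier here

def moderate_upload(image_bytes: bytes) -> str:
    score = objectionable_score(image_bytes)
    if score >= BLOCK_THRESHOLD:
        return "blocked"          # taken down before anyone sees it
    if score >= REVIEW_THRESHOLD:
        return "held_for_review"  # human raters decide, pre-publication
    return "published"

print(moderate_upload(b"\x89PNG..."))  # -> "published" with the stub model
```

The key design point is where the classifier sits: before publication rather than after user reports, so high-confidence cases never reach an audience at all.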
I talked about this example of accessibility for the blind and visually impaired: the better we get at building algorithms that can understand and annotate images automatically, the better we can describe photos to people who can't see them. That's a really cool example of an application of AI. And then there's all the research we're doing for the stuff that's coming.
Accessibility is a compelling example. Is there a way that this technology goes beyond accessibility? Is there another step to that?
100%, there are many, many other applications. Just to brainstorm a couple: one of them would be to build amazing social image search. In my keynote I was talking about this hypothetical example: you want to find a photo from what is still the most memorable family ski trip we've done. In the demo I showed, I was kind of lucky in that I was searching for images that are categorized as snow. Luckily the list of results isn't too big, but you can imagine -- I have over a thousand friends on Facebook, and you can have topics where you have tons and tons of images, right?
So building image search that allows you to be a lot more specific, that sort of says, 'Here is what I was doing -- don't only give me images of outdoors, snow, ski and people.' That's kind of cool, but it's going to be a ton of results, right? I want to say there were exactly three people in this image, and with the research we're doing, we may be able to say one day: 'I was wearing an orange jacket and a green helmet and there was a lake in the background.' It's the ability to go much, much deeper in how you explain what you're searching for.
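To make the idea concrete, here is a toy sketch of attribute-level image search: assume an upstream vision model has tagged each photo with concepts and counted the people in it, so a query can filter on exact attributes rather than broad categories. The data, fields and function are invented for illustration:

```python
# Toy photo index: concept tags and a person count per photo, as a
# vision model might produce them (all data invented).
photos = [
    {"id": 1, "tags": {"snow", "ski", "outdoors", "people"}, "people": 3},
    {"id": 2, "tags": {"snow", "outdoors"},                  "people": 12},
    {"id": 3, "tags": {"beach", "people"},                   "people": 3},
]

def search(photos, required_tags, exact_people=None):
    results = []
    for p in photos:
        if not required_tags <= p["tags"]:
            continue                      # must contain every required tag
        if exact_people is not None and p["people"] != exact_people:
            continue                      # "exactly three people in this image"
        results.append(p["id"])
    return results

print(search(photos, {"snow", "ski"}, exact_people=3))  # -> [1]
```

The deeper queries Candela imagines ("an orange jacket, a lake in the background") would just be richer attributes in the same filtering scheme, produced by a model that understands image content in finer detail.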
The other thing is, you hear us often talk about things like learning representations or unsupervised learning. What this means is that the more we can understand about the content of an object like an image, the better we can personalize the experience for people. And this may affect News Feed ranking as well, because today we only have a bunch of stats on how a story did once a few thousand people have seen it, and then we can see, 'Well, how are they liking it?' But being able to learn people's preferences... I may be able to learn what kinds of images are going to be appealing to you based on the structure and content of the image itself.
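One way to read the representation-learning point: if each image comes with an embedding vector, a user's taste can be summarized from images they already engaged with, and a brand-new image can be scored before any engagement stats exist. A minimal numpy sketch, with tiny invented vectors standing in for learned embeddings:

```python
import numpy as np

# Embeddings of images the user liked (toy 3-D vectors; real embeddings
# would come from a trained vision model).
liked = np.array([[0.9, 0.1, 0.0],
                  [0.8, 0.2, 0.1]])
user_taste = liked.mean(axis=0)       # a crude "preference" vector

def predicted_appeal(image_embedding: np.ndarray) -> float:
    """Cosine similarity between the image and the user's taste vector."""
    num = float(image_embedding @ user_taste)
    den = np.linalg.norm(image_embedding) * np.linalg.norm(user_taste)
    return num / den

new_image = np.array([0.85, 0.15, 0.05])   # no one has seen this yet
print(round(predicted_appeal(new_image), 3))  # scored from content alone
```

The point of the sketch is the cold-start advantage Candela hints at: the score comes from the image's content, not from waiting for a few thousand people to react to it.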
The other thing is, the more we are able to find an abstract language for understanding things like images, the more we can do with every label we collect. With all these things you always need humans to annotate all these images. We know an image is snow because we have annotated a couple thousand images that were snow, and then we build algorithms that can generalize and find similar images. Pushing that generalization out to the limit requires being able to go and understand the structure. It's going to have direct product impact from us understanding images more deeply, but it also feeds back into our ability to build better algorithms in the future.
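The "couple thousand annotated images" workflow is ordinary supervised learning: humans label a small set, a classifier fits those labels, and it generalizes to images no one labeled. A tiny scikit-learn sketch, with made-up two-dimensional features standing in for real image representations:

```python
from sklearn.linear_model import LogisticRegression

# (brightness, blue-ness) -- invented features for a handful of photos
X_train = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.3], [0.3, 0.1]]
y_train = [1, 1, 0, 0]   # 1 = human-annotated as snow, 0 = not snow

clf = LogisticRegression().fit(X_train, y_train)

# generalizes to images no human ever labeled
print(clf.predict([[0.85, 0.75], [0.25, 0.2]]))  # -> [1 0]
```

Candela's point is about leverage: the richer the learned representation the features live in, the further each human-provided label stretches.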
On the video side, you talked about real-time classification of videos. It seems like that's going to be really important, since we're going to see so much around live video right now. Is that just around context for better recommendations, or is there another piece to that?
100%, yeah, absolutely. It's going to be like trying to solve the needle-in-the-haystack problem. If live video does as well as I believe it's going to do, the amount of content is going to be overwhelming. And the question is going to be, well, the live map is cool -- it allows us to find public live streams by location -- but in densely populated areas like Manhattan, the amount of stuff that's going to be out there is going to be crazy.
The question is how we can build power tools, both for people but also maybe for video DJs. You can imagine a whole new medium emerging where people build their own channels based on live streams. But they're going to need the tools to browse those in real time, sort them out, eliminate all the stuff that's not interesting, being able to automatically find those and filter them out, [or] find all the ones that are about public figures -- election year, right? Let's find all the candidates from both parties and I can immediately see any live stream that contains [them]. I'm just making these up, but you kind of see...
To your point, it's going to be both filtering out stuff you don't care about and filtering in the stuff you're interested in. And then ranking, because the thing with filtering is, you could argue theoretically that we don't need to rank your News Feed if we give you a couple hundred buttons that allow you to say 'stories from this friend on or off, stories about these topics on or off.' The problem is you would quickly find that that's unwieldy -- it's just too many buttons. I think -- my opinion is -- there will be scope for both hard filtering but also just recommendations. And I don't think any of us really knows how this is going to unfold. I just know one thing: my team is going to be really busy.
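Candela's split between hard filters and ranking can be sketched in a few lines: explicit 'on or off' toggles prune the candidate streams, then a scoring function orders whatever survives, since hundreds of toggles alone would be unwieldy. The stream data and scoring heuristic below are invented for illustration:

```python
# Toy live-stream candidates (all data invented).
streams = [
    {"id": "a", "topic": "politics", "viewers": 1200, "friend": True},
    {"id": "b", "topic": "pets",     "viewers": 90,   "friend": True},
    {"id": "c", "topic": "politics", "viewers": 40,   "friend": False},
]

muted_topics = {"pets"}  # an explicit, user-controlled hard filter

def rank_score(s):
    # stand-in for a learned relevance model
    return s["viewers"] + (500 if s["friend"] else 0)

# Hard filtering prunes; ranking orders the rest.
candidates = [s for s in streams if s["topic"] not in muted_topics]
for s in sorted(candidates, key=rank_score, reverse=True):
    print(s["id"], rank_score(s))
# -> "a 1700" then "c 40": filtered in by topic, then ordered by the ranker
```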
It seems like one of the biggest challenges -- and not just for Facebook but for everyone who's looking at these problems -- is combating the perception that this is creepy. When you start talking about AI or facial recognition, it can be unsettling. How do you explain that to people?
Two thoughts on that: The first one is that at Facebook, inside Facebook, people's privacy is something we're obsessed with. So we have twice-a-year training where, in the same way we say 'don't leak secrets' and stuff like that, we say 'respect people's privacy and put it into whatever you design.' Start there. That's deeply ingrained in our culture.
Then the second thing, which is a problem I think is going to be very hard to solve, is that you have this tension between utility and -- I don't really want to say privacy -- personalization. If you were just a random number, and we changed that random number every five seconds, and that's all we knew about you, then none of the experiences you have online today -- and I'm not only talking about Facebook -- would be really useful to you. You'd hate it. I would hate it. So there is value, of course, in being able to personalize experiences and make access to information more efficient for you.
Now, the one thing that I think is incredibly important is transparency. At Facebook, it's interesting; there's a huge amount of effort that goes into making it as clear as possible to people who will see what they post. We keep redesigning things; we keep making it very obvious, like, 'You're about to post. Your friends are about to see this.' So obviously, if people don't get that, then we're doing our job wrong. People need to understand exactly who's going to see what.
The other thing we never, ever do is share anyone's information with anybody. So we will use your history: if you pick one of your friends, say they tend to post about two topics -- say, politics and family pictures. It's possible that you might be interested in their posts about politics because you think they have opinions that are worth reading, but it's possible that you don't want to see yet another picture of their dog or something like that. This is data that we keep about you, for sure, but we don't ever share that information with anyone. The thing is how this dialogue works -- how do we explain to people exactly what we hold, and the fact that we will never share it, and have them feel comfortable with the fact that this is going to help them?