Because naïve Bayes is often called "idiot's Bayes." As you'll see, you get to make lots of sloppy, idiotic assumptions about your data, and it still works! It's like the splatter-paint of AI models, and because it's so simple and easy to implement (it can be done in 50 lines of code), companies use it all the time for simple classification jobs.

Every chapter is like this and better. You never know what Foreman's going to say next, but you quickly expect it to be entertaining. Case in point: the next chapter is on optimization modeling using an example of, what else, commercial-scale orange juice mixing. It's just wild; you can't make this stuff up. Well, Foreman can make it up, it seems. The examples weren't just whimsical and funny; they were solid examples that built up throughout each chapter to show multiple levels of complexity for each model. I was constantly impressed with the instructional value of these examples, and how working through them really helped in understanding what to look for to improve the model and how to make it work.
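To make the "50 lines of code" point concrete, here's a minimal naive Bayes sketch in Python using scikit-learn (my own toy spam example, not one from the book):

```python
# A toy naive Bayes classifier: the model assumes every word count is
# independent given the class (the "idiotic" assumption), yet it still
# separates spam from ham surprisingly well.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = ["win free cash now", "lunch at noon?",
            "free cash offer inside", "see you at lunch"]
labels = ["spam", "ham", "spam", "ham"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)   # bag-of-words counts

model = MultinomialNB()
model.fit(X, labels)

print(model.predict(vectorizer.transform(["free cash now"])))  # ['spam']
```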
You get more bang for your buck spending your time on selecting good data and features than on models. For example, in the problem I outlined in this chapter, you'd be better served testing out possible new features like "customer ceased to buy lunch meat for fear of listeriosis" and making sure your training data was perfect than you would be testing out a neural net on your old training data.

We're into chapter 7 now with ensemble models. This technique takes a bunch of simple, crappy models and improves their performance by putting them to a vote. The same pregnancy data from the last chapter was used, but with this different modeling approach, it's a new example. The next chapter introduces forecasting models by attempting to forecast sales for a new business in sword-smithing. This example was exceptionally good at showing the build-up from a simple exponential smoothing model to a trend-corrected model and then to a seasonally-corrected cyclic model, all for forecasting sword sales.
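For a taste of the bottom rung of that build-up, here's a sketch of simple exponential smoothing in plain Python; the sword-sales numbers and the alpha value are invented for illustration:

```python
# Simple exponential smoothing: each new level estimate blends the latest
# observation with the previous estimate, weighted by alpha in [0, 1].
def simple_exponential_smoothing(series, alpha):
    level = series[0]            # initialize with the first observation
    smoothed = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

# Hypothetical monthly sword sales for a new smithy.
sales = [12, 15, 14, 18, 21, 19, 25]
print(simple_exponential_smoothing(sales, alpha=0.3))
```

The trend-corrected and seasonally-corrected models the chapter builds next add a slope term and per-season factors on top of this same blending step.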
Why? Because the phrase "garbage in, garbage out" has never been more applicable to any field than it is to AI. No AI model is a miracle worker; it can't take terrible data and magically know how to use it. So do your AI model a favor and give it the best and most creative features you can find.
As I've learned in the other data science books, so much of data analysis is about cleaning and munging the data. Running the model(s) doesn't take much time at all.
In fact, there are many applications where we are not only interested in the predicted class labels, but where the estimation of the class-membership probability is particularly useful (the output of the sigmoid function prior to applying the threshold function). Logistic regression is used in weather forecasting, for example, not only to predict if it will rain on a particular day but also to report the chance of rain. Similarly, logistic regression can be used to predict the chance that a patient has a particular disease given certain symptoms, which is why logistic regression enjoys great popularity in the field of medicine.

The sigmoid function is a fundamental tool in machine learning, and it comes up again and again in the book. Midway through the chapter, they introduce three new algorithms: support vector machines (SVM), decision trees, and K-nearest neighbors. This is the first chapter where we see an odd organization of topics. It seems like the first part of the chapter really belonged with chapter 2, but including it here instead probably balanced chapter length better. Chapter length was quite even throughout the book, and there were several cases like this where topics were spliced and diced between chapters. It didn't hurt the flow much on a complete read-through, but it would likely make going back and finding things more difficult.
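To see those class-membership probabilities in action, here's a minimal scikit-learn sketch (a made-up rain example, not the book's code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features: [humidity %, pressure drop in hPa].
X = np.array([[90, 5], [85, 4], [40, 0], [30, 1], [70, 3], [20, 0]])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = rain, 0 = no rain

model = LogisticRegression().fit(X, y)

# predict() applies the threshold; predict_proba() exposes the sigmoid
# output the passage calls out as the useful part.
print(model.predict([[80, 4]]))        # hard class label
print(model.predict_proba([[80, 4]]))  # probabilities, i.e. chance of rain
```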
A better way of using the holdout method for model selection is to separate the data into three parts: a training set, a validation set, and a test set. The training set is used to fit the different models, and the performance on the validation set is then used for the model selection. The advantage of having a test set that the model hasn't seen before during the training and model selection steps is that we can obtain a less biased estimate of its ability to generalize to new data.

It seems odd that a separate test set isn't enough, but it's true. Training a machine isn't as simple as it looks. Anyway, the next chapter circles back to ensemble learning with a more detailed look at bagging and boosting. (Machine learning has such creative names for things, doesn't it?) I'll leave the explanations to the book and get on with the review, so the next chapter works through an extended example application to do sentiment analysis of IMDb movie reviews. It's kind of a neat trick, and it uses everything we've learned so far together in one model instead of piecemeal with little stub examples. Chapter 9 continues the example with a little web application for submitting new reviews to the model we trained in the previous chapter. The trained model will predict whether the submitted review is positive or negative. This chapter felt a bit out of place, but it was fine for showing how to use a model in a (semi-)real application.
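Coming back to the holdout method for a moment, the three-way split can be sketched with two calls to scikit-learn's train_test_split (my illustration, not the book's exact code):

```python
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# First hold out the final test set; the model never sees it until the end.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Then split the remainder into training and validation sets,
# giving a 60% / 20% / 20% split overall.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 90 30 30
```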
However, logistic activation functions can be problematic if we have highly negative input, since the output of the sigmoid function would be close to zero in this case. If the sigmoid function returns outputs that are close to zero, the neural network would learn very slowly, and it becomes more likely that it gets trapped in local minima during training. This is why people often prefer a hyperbolic tangent as an activation function in hidden layers.

And they're trained with various types of back-propagation. Chapter 12 shows how to implement neural networks from scratch, and chapter 13 shows how to do it with TensorFlow, where the network can end up running on the graphics-card supercomputer inside your PC. Since TensorFlow is a complex beast, chapter 14 gets into the nitty-gritty details of what all the pieces of code do for the implementation of the handwritten digit identifier we saw in the last chapter. This is all very cool stuff, and after learning a bit about the CUDA programming behind this library from CUDA by Example, I have a decent appreciation for what Google has done to make it as flexible, performant, and user-friendly as they can. It's not simple by any means, but it's as complex as it needs to be. Probably.
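A quick numeric sketch (mine, not the book's) shows the problem: for strongly negative inputs the sigmoid saturates near zero and its gradient all but vanishes, while tanh is zero-centered with a steeper slope near the origin:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-5.0, -2.0, 0.0, 2.0, 5.0])
print("sigmoid :", sigmoid(z))    # squashes into (0, 1)
print("tanh    :", np.tanh(z))    # squashes into (-1, 1), zero-centered

# Derivatives: both flatten out at the extremes, but tanh's gradient is
# steeper around zero, so hidden units saturate less easily.
print("sigmoid':", sigmoid(z) * (1 - sigmoid(z)))
print("tanh'   :", 1 - np.tanh(z) ** 2)
```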
Out of all the privacy-focused products and apps available on the market, Brave has been voted the best. The other winners of Product Hunt's Golden Kitty awards showed that there was huge interest in privacy-enhancing products such as chat apps, maps, and collaboration tools.
Last year was a pivotal one for the crypto industry, but few companies managed to see the kind of success Brave did. Almost every day of the year was packed with action, as the company officially launched its browser, got its Basic Attention Token out, and onboarded hundreds of thousands of verified publishers onto its rewards platform.
Luckily, the effort Brave has been putting into its product hasn't gone unnoticed.
The company's revolutionary browser has been voted the best privacy-focused product of 2019, for which it received a Golden Kitty award. The awards, hosted by Product Hunt, were given to the most popular products across 23 different product categories.
Ryan Hoover, the founder of Product Hunt, said:
"Our annual Golden Kitty awards celebrate all the great products that makers have launched throughout the year."
Brave's win is important for the company: with this year's awards drawing the most user votes ever, it's a clear indicator of the browser's rapidly rising popularity.
If reaching 10 million monthly active users in December was Brave's crowning achievement, then the Product Hunt award was the cherry on top.
The recognition Brave got from Product Hunt users shows that a market for privacy-focused apps is thriving. All of the apps and products that got a Golden Kitty award from Product Hunt users focused heavily on data protection. Everything from automatic investment apps and remote collaboration tools to smart home products emphasized their privacy.
AI and machine learning rose as another noteworthy trend, but blockchain seemed to be the dominant force in app development. Blockchain-based messaging apps and maps were hugely popular with Product Hunt users, who seem to value innovation and security.
For those users, Brave is a perfect platform. The company's research and development team has recently debuted its privacy-preserving distributed VPN, which could potentially bring even more security to users than its existing Tor extension.
Brave's effort to revolutionize the advertising industry has also been recognized by some of the biggest names in publishing—major publications such as The Washington Post, The Guardian, NDTV, NPR, and Qz have all joined the platform. Some of the highest-ranking websites in the world, including Wikipedia, WikiHow, Vimeo, Internet Archive, and DuckDuckGo, are also among Brave's 390,000 verified publishers.