Saurav Jain (Open Source + Communities)
Developer Community Manager @Apify • ex 🥑@amplication • Open Source • Web Scraping/Automation • AI • Python • JavaScript • GitHub • Node.js

Jul 19, 2021, 9 tweets

๐—ฅ๐—ฒ๐—ถ๐—ป๐—ณ๐—ผ๐—ฟ๐—ฐ๐—ฒ๐—บ๐—ฒ๐—ป๐˜ ๐—น๐—ฒ๐—ฎ๐—ฟ๐—ป๐—ถ๐—ป๐—ด ๐Ÿค–

-A type of Machine Learning (What is it?)
-How it works
-Real-life Applications
-A book for learning RL
-Limitations

Let's learn about all these things in this thread!!

🧵👇

What is Reinforcement learning?

-An area of machine learning.

-In RL, an ML model (the agent) is trained to make a sequence of decisions.

-If the model makes a good decision, it is rewarded; otherwise it receives negative points (a penalty).

Perfect Explanation 👇

Working

Environment: The world, physical or simulated, in which the agent operates

State: Current situation of the agent

Reward: Feedback from the environment

Policy: Method to map the agent's state to actions

Value: Expected future reward that an agent would receive by taking an action in a particular state
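
To make these terms concrete, here is a minimal sketch of the agent-environment loop in Python. Everything in it is an assumption made up for illustration (the 5-cell corridor, the reward numbers, the function names); it is not taken from any RL library.

```python
import random

# Hypothetical 1-D corridor: cells 0..4, the agent starts at 0, the goal is cell 4.
GOAL = 4
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Environment: apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)  # walls clamp the move
    reward = 1.0 if next_state == GOAL else -0.1    # reward: feedback from the environment
    return next_state, reward, next_state == GOAL


def random_policy(state):
    """Policy: maps the agent's state to an action (here, uniformly at random)."""
    return random.choice(ACTIONS)


def run_episode(policy, max_steps=100):
    """Run one episode from cell 0 and return the total reward collected."""
    state, total = 0, 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    print("random policy:", run_episode(random_policy))
    print("always-right policy:", run_episode(lambda s: +1))
```

The total returned by run_episode is the quantity a "value" tries to predict: a policy that always steps right collects more reward than a random one, so states under that policy are worth more.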

Let's take the game of PacMan where the goal of the agent (PacMan) is to eat the food in the grid while avoiding the ghosts on its way.

The grid world is the interactive environment for the agent.

PacMan receives a reward for eating food and punishment if it gets killed by the ghost (loses the game).

The state is PacMan's location in the grid world, and collecting the highest total cumulative reward corresponds to PacMan winning the game.
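
Here is a rough sketch of how an agent could learn in a PacMan-like grid. It uses tabular Q-learning, which is one common RL algorithm and my choice for this example, not something the thread prescribes. The 3x4 layout, the single food cell, the ghost that never moves, and all the numbers are simplifying assumptions.

```python
import random

# Hypothetical layout: 3x4 grid, one food cell, one ghost that stays put.
ROWS, COLS = 3, 4
START, FOOD, GHOST = (0, 0), (2, 3), (1, 1)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Move PacMan one cell; return (next_state, reward, done)."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), ROWS - 1)
    col = min(max(state[1] + dc, 0), COLS - 1)
    nxt = (row, col)
    if nxt == FOOD:
        return nxt, 1.0, True    # ate the food: win
    if nxt == GHOST:
        return nxt, -1.0, True   # caught by the ghost: lose
    return nxt, -0.01, False     # small cost per move


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values Q[(state, action)] with Q-learning."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(ROWS) for c in range(COLS) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:          # explore occasionally
                action = rng.choice(list(ACTIONS))
            else:                               # otherwise act greedily
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned policy (best action in each state) from START."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

The table q plays the role of "value" and picking the best action per state is the "policy". A real PacMan agent would also need the ghost positions and remaining food in its state, which this sketch leaves out.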

๐—ฅ๐—ฒ๐—ฎ๐—น-๐—น๐—ถ๐—ณ๐—ฒ ๐—ฎ๐—ฝ๐—ฝ๐—น๐—ถ๐—ฐ๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€

-Building AI for playing games.
-Robot Navigation
-Industrial Automation
-Dialog Agents (Text, Speech)
-Online Stock Trading

๐—•๐—ผ๐—ผ๐—ธ ๐—ณ๐—ผ๐—ฟ ๐—น๐—ฒ๐—ฎ๐—ฟ๐—ป๐—ถ๐—ป๐—ด ๐—ฅ๐—Ÿ

A good book to learn about Reinforcement Learning is "Reinforcement Learning: An Introduction" by Richard S. Sutton and Andrew G. Barto

You can download/buy the book from this link! ๐Ÿ‘‡

๐Ÿ”— incompleteideas.net/book/the-book-โ€ฆ

๐—Ÿ๐—ถ๐—บ๐—ถ๐˜๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€

-The training process can be time-consuming

-The agent can make many mistakes while learning

-Too much reinforcement learning can lead to an overload of states, which can diminish the results.

-Reinforcement learning needs a lot of data and a lot of computation.
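
To get a feel for why states pile up and why so much data and computation are needed, here is a back-of-the-envelope count of states for a PacMan-like grid. The formula is a simplifying assumption of mine: PacMan and each ghost can be on any cell, and each food pellet is either eaten or not.

```python
def count_states(rows, cols, food_pellets, ghosts):
    """Rough upper bound on distinct states in a PacMan-like grid (illustrative only)."""
    cells = rows * cols
    pacman_positions = cells            # PacMan can be on any cell
    ghost_positions = cells ** ghosts   # each ghost can be on any cell
    food_configs = 2 ** food_pellets    # each pellet is eaten or still there
    return pacman_positions * ghost_positions * food_configs


if __name__ == "__main__":
    print(count_states(3, 4, food_pellets=1, ghosts=0))
    print(count_states(10, 10, food_pellets=20, ghosts=2))
```

Even a 10x10 grid with 20 pellets and 2 ghosts gives about a trillion states under this count, far too many for an agent to visit each one often.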

Hey All,

I hope I was able to make things easy to understand for you :)

If you like my content please retweet the first tweet of the thread!!

Sorry for not posting a thread yesterday, I was busy with some personal stuff 😅
