Question

Please solve and explain.

Please check: this is a COE topics HW.

Question 4: (6 Points) Consider a (2×3) game world that has 6 states {A, B, C, D, E, F} and four actions (up, down, left, right) as shown below. For every new episode, the game starts by choosing a random state and ends at the terminal state (F). When state F is reached, the player receives a reward of +10 and the game ends. For all other actions that do not lead to state F, the reward is -1. Assume that the greedy policy is used after training. Also, assume that α = 1 and γ = 0.9. Assume that the Q-learning algorithm was applied, and the following is the initial Q function Q(s, a), where s is a state and a is an action.

State \ action   up   down   left   right

A. Using the initial Q function, perform one action (B, up) and update the Q function. [2 pts]

B. Using the initial Q function, perform one episode and update the Q table starting from state A. Note that an episode is defined as a full game from a given state until the game ends. [4 pts]
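Both parts apply the standard Q-learning update Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)]. Below is a minimal Python sketch of that update, not a solution to the question as posed, because the grid figure and the initial Q values are not available here. It rests on several assumptions: the two parameters are read as learning rate α = 1 and discount factor γ = 0.9; the layout is a hypothetical "A B C" over "D E F"; a move into a wall leaves the agent in place with reward -1; the Q table is a placeholder of zeros; and actions within an episode are chosen greedily, with ties broken in the order up, down, left, right. All function names are illustrative.

```python
ALPHA = 1.0  # assumed learning rate (alpha)
GAMMA = 0.9  # assumed discount factor (gamma)
ACTIONS = ("up", "down", "left", "right")
TERMINAL = "F"

# Hypothetical layout; the original figure is not available.
GRID = [["A", "B", "C"],
        ["D", "E", "F"]]
POS = {s: (r, c) for r, row in enumerate(GRID) for c, s in enumerate(row)}
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def new_q_table():
    """Placeholder Q table of zeros; substitute the values given in the question."""
    return {s: {a: 0.0 for a in ACTIONS} for s in POS}


def step(state, action):
    """Return (next_state, reward). Assumption: hitting a wall keeps the agent in place."""
    r, c = POS[state]
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < len(GRID) and 0 <= nc < len(GRID[0])):
        nr, nc = r, c
    nxt = GRID[nr][nc]
    reward = 10 if nxt == TERMINAL else -1
    return nxt, reward


def q_update(Q, state, action):
    """Apply one Q-learning update in place and return the next state."""
    nxt, reward = step(state, action)
    # No future value is bootstrapped from the terminal state.
    future = 0.0 if nxt == TERMINAL else max(Q[nxt].values())
    Q[state][action] += ALPHA * (reward + GAMMA * future - Q[state][action])
    return nxt


def run_episode(Q, start, max_steps=50):
    """Follow greedy actions from start, updating Q, until F or max_steps."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == TERMINAL:
            break
        # max() returns the first maximal action, so ties follow ACTIONS order.
        action = max(ACTIONS, key=lambda a: Q[state][a])
        state = q_update(Q, state, action)
        path.append(state)
    return path


if __name__ == "__main__":
    Q = new_q_table()
    q_update(Q, "B", "up")               # part A style: a single (B, up) update
    print(Q["B"])
    print(run_episode(new_q_table(), "A"))  # part B style: one episode from A
```

To work the actual question, replace the zeros in `new_q_table` with the initial Q values from the assignment and adjust `GRID` to match the figure. With α = 1, each update simply overwrites Q(s, a) with r + γ max_a' Q(s', a').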


