EricLee911110/AWS_DeepRacer

AWS_DeepRacer

Testing out reward functions. This also works as a journal of how I reached the top 50s in the AWS Student League :D

image

First Attempt

First try, with 10 minutes of training.

image

Second Attempt

Tried 60 minutes of training on a new method (point the car along the tangent through the two closest waypoints). It failed because the two waypoints are too far apart. A great lesson: when you test out a new reward function, build an MVP and check the result first. Instead of waiting 60 minutes, 20 minutes is enough to see how the new method behaves.
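A minimal sketch of the "tangent of the two closest waypoints" idea, not the exact code from this attempt. It assumes the standard DeepRacer inputs `waypoints`, `closest_waypoints` and `heading`; the helper names `track_direction_deg` and `heading_error_deg` are made up for illustration.

```python
import math


def track_direction_deg(waypoints, closest_waypoints):
    """Direction (degrees, -180..180) of the segment joining the two closest waypoints."""
    prev_x, prev_y = waypoints[closest_waypoints[0]]
    next_x, next_y = waypoints[closest_waypoints[1]]
    return math.degrees(math.atan2(next_y - prev_y, next_x - prev_x))


def heading_error_deg(heading, direction):
    """Smallest absolute angle (0..180) between the car heading and the track direction."""
    diff = abs(direction - heading)
    return 360.0 - diff if diff > 180.0 else diff
```

When the waypoints are far apart, this segment is only a coarse approximation of the real curve, which matches the failure described above.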

image image image

For video (link)

Third Attempt

20 minutes of training, focusing on the difference between heading and car_next_waypoint_degree. The model gets a heavy penalty (reward *= 0.1) if the difference is larger than the threshold (40 degrees). It turns out that when the car reaches a corner, the model concludes that its previous behavior was completely wrong, so it tends to make a hard right turn even when it should keep turning left.
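Roughly, the penalty could look like the sketch below. The threshold (40 degrees) and the multiplier (0.1) come from the notes above; the base reward of 1.0 and the way `car_next_waypoint_degree` is computed (angle from the car position to the next waypoint) are assumptions, not the original code.

```python
import math

THRESHOLD_DEG = 40.0  # threshold from the notes above
PENALTY = 0.1         # heavy penalty from the notes above


def reward_function(params):
    # Car position and the next waypoint (standard DeepRacer input parameters).
    x, y = params["x"], params["y"]
    next_x, next_y = params["waypoints"][params["closest_waypoints"][1]]

    # Assumed definition: angle of the line from the car to the next waypoint.
    car_next_waypoint_degree = math.degrees(math.atan2(next_y - y, next_x - x))

    # Smallest absolute difference between that angle and the car heading.
    diff = abs(car_next_waypoint_degree - params["heading"])
    if diff > 180.0:
        diff = 360.0 - diff

    reward = 1.0  # assumed base reward
    if diff > THRESHOLD_DEG:
        reward *= PENALTY
    return float(reward)
```

Because the penalty is a hard cliff at the threshold, a car that is mid-corner can suddenly lose 90% of its reward, which may explain the overcorrection described above.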

image image

Fourth Attempt

Same as the Third Attempt but with a lighter penalty. The model makes less sharp turns than the previous one, but it becomes numb to penalties and needs more time to train. Still, this idea won't work.

image Video (link)

Fifth Attempt

The problem with all my attempts so far is that the model doesn't know what to do when it approaches a corner. So I decided to give a penalty if the car stays on the right side of the track ahead of a left-turning corner, and vice versa. Another feature compares the progress difference between now and 30 seconds before: the model gets more reward if it progressed more than before.
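Here is a hedged sketch of the wrong-side penalty. The 0.7 multiplier and the 10 degree turn threshold are taken from values mentioned in later attempts; the lookahead of 5 waypoints, the base reward and the helper names are assumptions for illustration only.

```python
import math

LOOKAHEAD = 5               # hypothetical: how many waypoints ahead to look
TURN_THRESHOLD_DEG = 10.0   # assumed: direction change that counts as a corner
WRONG_SIDE_PENALTY = 0.7    # penalty value mentioned in later attempts


def _segment_direction(waypoints, i):
    """Direction (degrees) of the segment from waypoint i to waypoint i + 1."""
    n = len(waypoints)
    x0, y0 = waypoints[i % n]
    x1, y1 = waypoints[(i + 1) % n]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


def upcoming_turn(waypoints, closest_waypoints):
    """Return 'left', 'right' or 'straight' for the track ahead of the car."""
    now = _segment_direction(waypoints, closest_waypoints[0])
    ahead = _segment_direction(waypoints, closest_waypoints[0] + LOOKAHEAD)
    # Wrap the direction change into -180..180; positive means counterclockwise (left).
    change = (ahead - now + 180.0) % 360.0 - 180.0
    if change > TURN_THRESHOLD_DEG:
        return "left"
    if change < -TURN_THRESHOLD_DEG:
        return "right"
    return "straight"


def reward_function(params):
    reward = 1.0  # assumed base reward
    turn = upcoming_turn(params["waypoints"], params["closest_waypoints"])
    on_left = params["is_left_of_center"]
    # Penalize the outside line: right side before a left turn, and the converse.
    if (turn == "left" and not on_left) or (turn == "right" and on_left):
        reward *= WRONG_SIDE_PENALTY
    return float(reward)
```

Looking a few waypoints ahead gives the model a reward signal before the corner starts, which is the piece the earlier attempts were missing.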

image Video(Link)

YESSS!! It only went off-track 4 times and I got a really good score of 3:17. It is close to the Udacity nanodegree requirement. Let's keep the good pace going. :)

Fifth_two Attempt

There are three ways to fix the problem of running off-track. First, more reward if the car is closer to the center line. Second, more reward for staying on either the right side or the left side. Third, more training time.
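For the first option, a sketch in the style of the well-known "follow the center line" example: the closer the car is to the center, the higher the reward band it falls into. The marker fractions and reward values below are illustrative defaults, not the tuned values from this journal.

```python
def reward_function(params):
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three bands at increasing distance from the center line (illustrative values).
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track
    return float(reward)
```

Shrinking a marker or lowering the reward of a band, as done in later attempts, changes how strongly the car is pulled toward the center.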

image

Simply adding 10 minutes does help the model perform better.

Sixth Attempt

Less penalty if the car is on the opposite side from the wanted one: 0.7 -> 0.8. Try to stay closer to the center: shrink marker_2 0.3 -> 0.25.

image Video(link)

After testing, I really think vague commands need more training time.

Seventh Attempt

Back to the original penalty if the car is on the opposite side from the wanted one: 0.8 -> 0.7. Less reward on marker_2: 0.9 -> 0.8.

image Video(Link)

I don't know what I did wrong.....

Eighth Attempt

GAS GAS GAS! I'm gonna step on the gas. Clone of the Fifth Attempt, plus a speed detection function.

image Video(link)

Maybe I have to train it from the beginning to speed it up.

Ninth Attempt

Speed up with progress and avg_speed.

image Video(link)

The run is almost as good as the Fifth Attempt, but the speed didn't improve significantly.

Tenth Attempt

Canceled

Eleventh Attempt

Check speed every 3 seconds and multiply the reward by 1.32. Progress_diff turned up to *1.2. 30 mins of training. No penalty while making a right turn.
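A rough sketch of a periodic progress check, written as a small helper class so it can be tried outside the simulator. It assumes about 15 reward calls (steps) per second and that `steps` is an integer; the class name `ProgressTracker` and the rule "bonus if this window gained more progress than the previous one" are my reading of progress_diff, not the original code. The same pattern could be reused for the speed check with its own multiplier.

```python
STEPS_PER_SECOND = 15  # assumption: the simulator calls the reward function ~15 times per second


class ProgressTracker:
    """Hypothetical helper: bonus when progress gained in a window beats the previous window."""

    def __init__(self, window_seconds=3, bonus=1.2):
        self.window_steps = window_seconds * STEPS_PER_SECOND
        self.bonus = bonus
        self.last_checkpoint_progress = 0.0
        self.last_gain = 0.0

    def multiplier(self, steps, progress):
        # Only evaluate at window boundaries; otherwise leave the reward unchanged.
        if steps == 0 or steps % self.window_steps != 0:
            return 1.0
        gain = progress - self.last_checkpoint_progress
        improved = gain > self.last_gain
        self.last_checkpoint_progress = progress
        self.last_gain = gain
        return self.bonus if improved else 1.0


# Usage sketch inside a reward function (state lives at module level):
# tracker = ProgressTracker()
# reward *= tracker.multiplier(params["steps"], params["progress"])
```

A real version would also need to reset the tracker at the start of every episode, since `steps` and `progress` start over.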

image Video(link)

Twelfth Attempt

Speed every 2 seconds: *1.22. Progress_diff: *1.1. More reward for sticking toward the border while turning: *1.15. 20 mins of training.

image Video(link)

Thirteenth Attempt

20 more mins of training with a clone of the 12th.

image

Video(link)

Fourteenth Attempt

20 mins. Speed every 1 second. Turn left penalty 0.7 -> 0.6. Remove speed up by progress.

image

Fifteenth Attempt

20 more mins of training on a clone of the 14th.

image

Sixteenth Attempt

20 more mins of training on a clone of the 14th.

Seventeenth Attempt

20 more mins of training on a clone of the 14th, but with track_angle 10 -> 5.

Eighteenth Attempt

Less reward for being on the right side of the track: 1.1 -> 1.05. Init waypoint set to 0 instead of 1. Marker_2 *0.8.

Nineteenth Attempt

20 more mins of training on a clone of the 18th.

image Video(link)

Still has the problem of running off track at the same spot.

Twentieth Attempt

15 mins. Right side reward 1.05 -> 1.03. Turn left angle 5 -> 18.

image

21st Attempt

10 mins. Turn left angle 18 -> 0.

22nd Attempt

Clone of the 15th attempt. Right side bonus only on a straight line or when turning right.

image

23rd Attempt

Clone of the 22nd attempt. Straight line marker_2 0.75 -> 0.72. Left turn distance from center 1.15 -> 1.18.

image

24th Attempt

Clone of the 23rd attempt. Right side reward 1.03 -> 1.01. Left turn distance from center 1.18 -> 1.2.

image

25th Attempt

20 mins. marker_1 * 1.22, marker_2 * 1.2.

image Video(link)

Personal Best :D

26th Attempt

20 mins. Clone of the 25th attempt. Right side reward 1.02. marker_2 when turning left 1.21.

image Video(link)

Broke my personal best again!

27th Attempt

20 mins. Clone of 26th attempt. Right side reward 1.02 -> 1.01. Left side reward turning left 1.0 -> 1.02

image

28th Attempt

20 mins. Clone of the 26th attempt. Track_4 *0.6. Track_3 0.1 -> 0.8. marker_1 1.22 -> 1.25.

image

29th Attempt

55 mins. Same as the 28th, but I think the problem with the 28th attempt is that it keeps the memory from the past. That's why it keeps failing. Last try of this month.

image

Didn't go off track. That is the happiest thing ever :D Unfortunately, the speed isn't fast enough. My best record is 03:06.399. If I can reach the finish line 7 seconds earlier, I will be eligible for the Udacity nanodegree sponsorship. Let's put this aside and join again next month; you will see me come back in June 2022. Stay tuned. :)
