Baxter the Mailman

A video illustrating our project:



Final Report:
1)      Introduction
a.       Describe the end goal of your project.
We aim to automate mail sorting with the Baxter robot. In other words, we expect the Baxter to take over the sorting work of an experienced staff member at a local mail station. To do so, the Baxter should catch a piece of mail or a small package when someone hands it over, sort the mail according to the handwritten zip code on its surface, and finally stack the mail by size.
b.      Why is this an interesting project? What interesting problems did you need to solve to make your solution work?
As e-commerce has gained popularity in recent years, we believe that online shopping will occupy a large share of the market in the future. This makes the delivery of goods a prominent problem. The current efficiency of the delivery industry will not be sufficient for the growing demand from online shopping, and slow delivery will hurt the customer experience. We therefore think that using robots to automate mail sorting would significantly increase efficiency.
The problems involved in this project include Computer Vision for mail recognition and zip code contour recognition, Machine Learning for mail sorting, camera sensing for AR tag tracking, path planning, and algorithm development for piling up.
c.       In what real-world robotics applications could the work from your project be useful?
The work applies to the mail industry. The Baxter could work as a staff member in a local mail store, catching mail or packages from customers as they hand them over. This would make the customer-facing part of the industry more efficient. Similar automated sorting processes could also be applied in libraries or warehouses.
2)      Design
a.       What design criteria must your project meet? What is desired functionality?
The project is designed to include sensing, path planning and actuation. In more detail, our initial design is to use the Baxter robot to build a human-robot interactive program where the robot’s job includes:
·         Sensing the position of the human using Kinect
·         Moving the Baxter’s arms to get the mails from the human
·         Recognizing the handwritten zip codes on the mail
·         Placing the mail in the corresponding regions
·         Piling up the mail according to size
b.      Describe the design you chose.
The final design of the mail sorting automation project includes three steps. First, the robot locates the handed mail. We attach AR tags that encode the size of each piece of mail and serve for tracking, and we use the camera in the Baxter's arm to read the AR tags, plan a path, and catch the mail. Second, the robot puts the mail in the correct region according to the handwritten zip code on the mail. Machine Learning algorithms are used first to find the contours of the digits and then to classify the handwritten digits. The classification model is a Neural Network trained on the MNIST dataset. Finally, after all the mail has been sorted, the Baxter restacks each pile so that it is well organized: the biggest pieces at the bottom and the smallest on top. For this step we developed a custom sorting algorithm.
c.       (And for part d.)What design choices did you make when you formulated your design? What tradeoffs did you have to make? How do these design choices impact how well the project meets design criteria which would be encountered in a real engineering application, such as robustness, durability and efficiency?
First, we decided to use AR tags to track the mail locations. This would be unrealistic in a real mail station. A better approach would be to use Computer Vision to determine mail locations and a Kinect to track the human's location. However, implementing Kinect tracking in a real setting is too complicated and costly, and the Kinect is also not as robust as AR tags for locating objects. To improve the efficiency and the efficacy of the entire process, we ended up using AR tags and regular cameras.
Second, in the pile-up process, we decided to use preset locations for the mail piles and avoided reading AR tags repeatedly. This significantly reduces the time the robot needs to locate the mail and compute the trajectories for the pile-up phase. It also avoids the problem that the AR tags may be blocked from the camera's view, which could impede the entire process.
Third, we used the IR sensor from the robot's arm to determine if the mail is close enough to the gripper. This design allows us to get closed-loop feedback on the distance between the gripper and the mail. It improves the robustness and the durability of mail capturing.
Fourth, we initially wanted to use clippers to capture the mail. We soon realized that this would involve moving both arms of the robot, which would not only increase the time for path planning but also raise the chance of failing to catch the mail. We therefore adopted a suction cup instead. Our results show that the suction cup is more effective than the clippers.
Fifth, a collision object (the table) was added to the environment to avoid collisions between the robot and the table. The collision object was set to be smaller than the actual table to make pile-up work. If the collision object had the actual table size, pile-up would fail every time, because the planner would report a collision whenever the gripper tried to pick up mail from the table. Adding this approximate collision object successfully reduced the number of times the robot collided with its surroundings.
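The idea of an undersized collision object can be illustrated with a small sketch. The report does not say which dimension was reduced or by how much, so the table dimensions, the `top_clearance` parameter and the `Box` type below are all hypothetical; a box like this would then be handed to the planning scene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box with sizes in metres (hypothetical helper type)."""
    length: float
    width: float
    height: float


def shrink_table(actual: Box, top_clearance: float, side_margin: float = 0.0) -> Box:
    """Return a collision box that is smaller than the real table.

    Lowering the top by `top_clearance` leaves the planner room to bring the
    gripper down to mail lying on the real table surface.
    """
    if top_clearance < 0 or side_margin < 0:
        raise ValueError("margins must be non-negative")
    shrunk = Box(
        length=actual.length - 2 * side_margin,
        width=actual.width - 2 * side_margin,
        height=actual.height - top_clearance,
    )
    if min(shrunk.length, shrunk.width, shrunk.height) <= 0:
        raise ValueError("margins remove the whole table")
    return shrunk


# Assumed numbers for illustration only, not measured from the lab setup.
real_table = Box(length=1.2, width=0.6, height=0.75)
collision_table = shrink_table(real_table, top_clearance=0.05)
```

The tradeoff is the one described above: a larger clearance makes pick-up paths easier to find but gives less protection against the arm touching the real table.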
3)      Implementation
a.       (And b.)Describe any hardware you used or built. Illustrate with pictures and diagrams. What parts did you use to build your solution?
In this project, we mainly use the Baxter robot and its accessory kit.
The Baxter robot (full view):



The camera in the Baxter's left arm, with a maximum resolution of 1280×800, is used for locating the mail and capturing images for digit recognition.

A suction cup is installed at the end of the Baxter's left arm for the purpose of capturing mails.

The screen on the Baxter's head is used to display the output image of the digit recognition.

The IR sensor on the Baxter's arm is used to determine the distance between the gripper and the mails.

b.      What parts did you use to build your solution?(see above)
c.       Describe any software you wrote in detail. Illustrate with diagrams, flow charts, or other appropriate visuals. This includes launch files, URDFs, etc.
Source files: these files ensure that the dependencies and the paths of the project are correct.
Mailman.py: This file is the main program of the project. It contains interactive sessions in which the program takes commands from the operator to perform different tasks.
The first session is the hand-over. In this session, the program tracks the TF messages produced from the AR tags and reacts immediately by starting path planning. We use the MoveIt! kit to generate a feasible path using inverse kinematics. After the path has been calculated, the gripper moves to the destination, and the IR sensor checks whether the gripper is close enough to the mail for the suction cup to catch it. If not, the gripper automatically adjusts its pose by moving forward a little at a time until it reaches the mail.
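The IR closed loop described above can be sketched as follows. This is a minimal illustration, not the code in Mailman.py: the range reader and the step command are passed in as callables, and the threshold, step size and step budget are assumed values. `FakeArm` is a hypothetical stand-in so the loop can run without a robot.

```python
from typing import Callable


def approach_until_close(
    read_range_m: Callable[[], float],
    step_forward: Callable[[float], None],
    grip_range_m: float = 0.03,   # assumed suction range
    step_m: float = 0.01,         # assumed step size
    max_steps: int = 30,          # assumed step budget
) -> bool:
    """Nudge the gripper forward until the range reading is within grip_range_m.

    Returns True once the mail is close enough for suction, and False if the
    step budget runs out first.
    """
    for _ in range(max_steps):
        if read_range_m() <= grip_range_m:
            return True
        step_forward(step_m)
    return read_range_m() <= grip_range_m


class FakeArm:
    """Stand-in for the robot arm and its IR range sensor."""

    def __init__(self, distance_m: float) -> None:
        self.distance_m = distance_m
        self.steps = 0

    def read_range(self) -> float:
        return self.distance_m

    def step_forward(self, step_m: float) -> None:
        self.distance_m -= step_m
        self.steps += 1


arm = FakeArm(distance_m=0.08)
reached = approach_until_close(arm.read_range, arm.step_forward)
```

Bounding the number of steps keeps the arm from creeping forward indefinitely if the sensor never reports a close reading.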
The second session is digit recognition. While catching the package, the camera on the Baxter's arm captures and saves a picture of the mail so that the program can perform digit recognition. If the output digits match a preset region, the MoveIt! kit calculates a path for the arm to move the mail to the corresponding region.
The third session is pile-up. After all the mail has been sorted into the different regions, the program picks a region and begins the pile-up. During the mail catching process, the program records the order of the mail in each pile. We encoded the mail size information in the AR tags, so the sorting is based on the AR tags. Our algorithm allows three temporary zones on the table and performs a sort similar to the Tower of Hanoi. After all the piles have been sorted, the program ends.
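The report does not spell out the pile-up algorithm, so the following is only a sketch of one way to sort a pile when the robot can move just the top piece of mail at a time. It uses two temporary zones (named `hold` and `temp` here, both hypothetical) out of the three available, and represents each piece of mail by its size code.

```python
from typing import Dict, List, Tuple

Zones = Dict[str, List[int]]  # each list is a pile: bottom first, top last


def move(zones: Zones, src: str, dst: str, log: List[Tuple[str, str]]) -> None:
    """Move the top mail of one zone onto another (one pick-and-place)."""
    zones[dst].append(zones[src].pop())
    log.append((src, dst))


def sort_pile(pile: List[int]) -> Tuple[List[int], List[Tuple[str, str]]]:
    """Reorder a pile so the largest mail is at the bottom, smallest on top.

    Returns the sorted pile and the list of (source, destination) moves.
    """
    zones: Zones = {"pile": list(pile), "hold": [], "temp": []}
    log: List[Tuple[str, str]] = []
    while zones["pile"]:
        move(zones, "pile", "hold", log)
        # Keep "temp" ordered with its largest mail on top: anything larger
        # than the held mail goes back onto the pile to be handled again.
        while zones["temp"] and zones["temp"][-1] > zones["hold"][-1]:
            move(zones, "temp", "pile", log)
        move(zones, "hold", "temp", log)
    # Unstacking "temp" puts the largest mail down first, so it ends up lowest.
    while zones["temp"]:
        move(zones, "temp", "pile", log)
    return zones["pile"], log


sorted_pile, moves = sort_pile([2, 3, 1])
```

Every entry in `moves` corresponds to one arm motion between preset locations, so the length of the list gives a rough estimate of how long a pile takes to sort.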
d.      How does your complete system work? Describe each step.

4)      Results
a.       How well did your project work? What tasks did it perform?
(The rates were calculated from the video we filmed, in which we performed the entire process multiple times.)
Our project involves the following tasks:
1. Path planning:
We use the MoveIt! kit and the Baxter simulator as the main tools to calculate paths. To help the path planning program run stably, we carefully calibrated the preset poses of the Baxter in our program, which greatly reduced the failure rate of generating a valid path. The project ran on a lab computer with 2 GiB of memory.
Success rate: 87%

2. Contour recognition:
Contour recognition is a complicated task because any other object in the frame may be recognized as a valid contour. Therefore, to improve the contour recognition accuracy, we devised a specific pose for the human handing over the mail, and we kept the handwritten digit sizes consistent across all the sample mail for this project. In this simplified setting, contour recognition performs reasonably well.
Success rate: 82%
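Because the hand-over pose and digit size are kept consistent, spurious contours can be rejected by size. The sketch below assumes each contour has already been reduced to a bounding box `(x, y, width, height)` in pixels (for example by an image library); the thresholds are hypothetical and would need tuning on real images.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


def select_digit_boxes(
    boxes: Iterable[Box],
    min_h: int = 20,          # assumed smallest digit height
    max_h: int = 80,          # assumed largest digit height
    max_aspect: float = 1.2,  # assumed limit on width / height
) -> List[Box]:
    """Keep boxes that look like digits and order them left to right."""
    kept = [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if min_h <= h <= max_h and w > 0 and w / h <= max_aspect
    ]
    return sorted(kept, key=lambda box: box[0])


candidates = [
    (300, 50, 30, 40),    # digit
    (10, 10, 400, 300),   # outline of the envelope itself
    (220, 52, 28, 42),    # digit
    (260, 51, 12, 41),    # narrow digit such as "1"
    (5, 5, 3, 4),         # speck of noise
]
digit_boxes = select_digit_boxes(candidates)
```

Sorting by the x coordinate restores the reading order of the zip code before the digits are classified.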

3. Handwritten digit recognition:
For this part, we used Machine Learning to recognize digits, and we encountered a problem. The handwritten digit recognition works very well on all digits except "9".
The reason is that in our handwriting we usually leave out the hook at the bottom of the digit "9", which causes the program to fail and output "1". This problem could be fixed by writing "9" in a print style, but that is not what we want, since it defeats the purpose of handwritten digit recognition. To work around the issue, we considered the real-life situation: the Baxter has limited arm length and can therefore only sort mail for a limited range of zip codes. We decided to track only the last three digits of the zip codes, which reduced the failure rate of the Baxter.
General success rate: 74%
Failure rate on digit "9": 89%
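Using only the last three digits can be sketched as a simple lookup. The region table below is hypothetical, and the example assumes, for illustration, that the troublesome "9" appears only among the leading digits.

```python
from typing import Dict, Optional, Sequence


def region_for(digits: Sequence[int], regions: Dict[str, str]) -> Optional[str]:
    """Map recognized zip-code digits to a drop-off region using the last three.

    Returns None when fewer than three digits were recognized or when the
    ending does not belong to any preset region.
    """
    if len(digits) < 3 or any(not 0 <= d <= 9 for d in digits):
        return None
    key = "".join(str(d) for d in digits[-3:])
    return regions.get(key)


# Hypothetical mapping from zip-code endings to preset regions on the table.
REGIONS = {"720": "left", "704": "center", "612": "right"}

correct_read = region_for([9, 4, 7, 2, 0], REGIONS)
misread_nine = region_for([1, 4, 7, 2, 0], REGIONS)  # leading "9" read as "1"
```

Both reads land in the same region, so a misclassified leading digit no longer affects where the mail is placed.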

4. Suction cup + IR closed loop sensing:
The suction cup and IR closed-loop sensing are built into the Baxter, and we found that they worked stably and almost never failed.
Success rate of suction cup: 99%
Success rate of IR sensing: 99%

5. Hand-over:
The hand-over part combines all of the processes above. Apart from the problems already mentioned, we encountered another one: the Baxter's arm would overmove and hit the table (or the wall, or the wires on the robot). We soon realized this was a severe issue and decided to recalibrate the Baxter. After reconfiguration, the failure rate was significantly reduced.
Success rate (every single hand-over is counted): 45 out of 78
Path-not-found failures: 5 out of 78
Collision failures: 28 out of 78

6. Pile-up:
In the pile-up process, the algorithm we developed works well, in the sense that the robot can figure out the correct pick-up and drop-off locations for each piece of mail. However, as in the hand-over process, we still ran into the overmove problem. To resolve this, we carefully calibrated the size of the table collision object so that a valid path can be found and the path does not collide with the real table.
Success rate (counting each mail move): 102 out of 136
Overmove failures: 34 out of 136
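For easier comparison with the percentages reported for the other tasks, the hand-over and pile-up counts can be converted to rates. The short script below uses only the counts listed above.

```python
def rate(count: int, total: int) -> float:
    """Percentage of `count` in `total`, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 1)


handover = {"success": 45, "path not found": 5, "collision": 28}
pile_up = {"success": 102, "overmove": 34}

for name, counts in (("hand-over", handover), ("pile-up", pile_up)):
    total = sum(counts.values())
    for outcome, n in counts.items():
        print(f"{name:9s} {outcome:15s} {n:3d}/{total} = {rate(n, total)}%")
```

This gives roughly 57.7% success for hand-over (6.4% path-not-found, 35.9% collision) and 75.0% success for pile-up (25.0% overmove).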
b.      Illustrate with a video and pictures.

5)      Conclusion
a.       Discuss your results. How well did your finished solution meet your design criteria?
Although the figures above include many failures, the success rate in fact increased as we continuously fixed bugs and recalibrated the Baxter. In the last 3 tests, the process ran more smoothly and avoided most of these failures.
Each design criterion is discussed below.
Sensing
For locating mail with the camera, our implementation uses AR tags, whose TF messages supply the information needed for path planning and digit recognition. The AR tags also make it possible for the robot to interact with a human. IR sensing was adopted to measure the distance between the object and the gripper and to ensure that the suction cup works properly.
Moving
We use the pre-installed MoveIt! kit to perform path planning. Despite some unexpected failures (e.g. hitting the table, or the arm getting entangled with wires), once we reconfigured and recalibrated the robot, its arm moved smoothly and the gripper was able to catch mail from a human as well as pick up mail from the table.
Recognizing
We adopted Machine Learning techniques and used the picture captured by the camera. Although the algorithm failed to recognize the digit "9", the other digits were mostly classified correctly and matched the data we preset.
Putting
The regions were set manually to ensure that a path could be found. Once the recognition step yields a zip code that matches a region, the MoveIt! kit calculates a collision-free path and the arm places the mail accurately in the corresponding region.
Path Planning
The data above show that we encountered some difficulties using the MoveIt! kit to calculate paths. Almost every step of the process requires a path for the Baxter's arm, so path planning is the most important part of the project. We tried several measures to increase the success rate of path planning, e.g. tuning the parameters of the MoveIt! kit and presetting arm poses. In the last few tests, path planning worked stably without any failures. The finished solution provides a robust user experience with the Baxter.
b.      Did you encounter any particular difficulties?
As mentioned above, we encountered problems during the path-planning phase. At the start, the path planning tool, the MoveIt! kit, kept failing and could not return a valid path. We considered everything in the surroundings that could influence the path plan, as well as the initial configuration of the program and of the Baxter’s arm. We later discovered that if we gave the program more time and more attempts, it was more likely to find a valid trajectory for the arm, so we allowed the planner to run longer than the original setting. We also discovered that in the piling-up phase the arm refused to move close to the table, because the program rejected paths that collided with obstacles, so we set the size of the table to be smaller than the actual one.
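Giving the planner more time and more attempts can be sketched as a retry loop. This is an illustration under assumptions rather than the project's code: `try_plan` is a hypothetical wrapper around the planner that returns a plan or `None`, and the starting budget, number of attempts and growth factor are made-up values.

```python
from typing import Callable, Optional, TypeVar

Plan = TypeVar("Plan")


def plan_with_retries(
    try_plan: Callable[[float], Optional[Plan]],
    time_budget_s: float = 5.0,  # assumed initial planning time
    attempts: int = 5,           # assumed number of attempts
    growth: float = 1.5,         # assumed increase of the budget per retry
) -> Optional[Plan]:
    """Call the planner repeatedly, allowing more time after each failure."""
    budget = time_budget_s
    for _ in range(attempts):
        plan = try_plan(budget)
        if plan is not None:
            return plan
        budget *= growth
    return None


# Fake planner for demonstration: it only succeeds with at least 10 s.
budgets_tried = []


def fake_planner(budget_s: float) -> Optional[str]:
    budgets_tried.append(budget_s)
    return "trajectory" if budget_s >= 10.0 else None


result = plan_with_retries(fake_planner)
```

Capping the number of attempts keeps the robot from stalling when no valid path exists, for example when the target lies inside a collision object.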
Then, in the real movement, the arm kept hitting the table even though the path shown by the program avoided the obstacle. Another group helped us recalibrate the Baxter’s arm and reconfigure the settings. Thanks to their help, we were finally able to avoid this issue.
We also encountered a problem during digit recognition. The digit “9” could not be recognized because in our handwriting we left out the hook at the bottom of the digit. This causes the program to read “9” as “1”. The problem could be fixed by writing “9” in a print style, but that defeats the purpose of handwritten digit recognition. Adding some of our own “9”s to the training set might help the classifier learn better; however, this would not be a valid approach in a real mail sorting setting. So we decided to track only the last three digits of the zip codes. This is closer to a real-life engineering setting, because a single robot will not be able to sort all the zip codes in a mail sorting facility. Instead, it will only sort a range of zip codes, and the first 2 digits will be the same for the whole region. Using only the last three digits helped a lot in increasing the success rate.
c.       Does your solution have any flaws or hacks? What improvements would you make if you had additional time?
Preset poses were used to avoid path planning failures. We hard-coded these poses, but a better and more general approach would not require any preset poses, and the Baxter would decide by itself where to drop the mail.
We still have difficulty recognizing the mail size using Computer Vision, since the mail can be handed to the robot from different distances, so we used AR tags with coded sizes. In the future, we might be able to find a formula that computes the mail size from the contour of the mail and the IR distance reading during image processing, so that we could drop the AR tags and simplify the hand-over process.
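One candidate for such a formula is the pinhole camera relation, in which the real extent equals the pixel extent times the distance, divided by the focal length in pixels. The sketch below is only an assumption about how this future work could start: the focal length would have to come from a camera calibration, and the numbers are made up.

```python
def estimate_size_m(pixel_extent: float, distance_m: float, focal_length_px: float) -> float:
    """Estimate the real extent of an object from its extent in the image.

    pixel_extent:    width or height of the mail contour, in pixels
    distance_m:      distance from the camera to the mail, in metres
    focal_length_px: camera focal length in pixels (from calibration)
    """
    if pixel_extent < 0 or distance_m <= 0 or focal_length_px <= 0:
        raise ValueError("inputs must be positive")
    return pixel_extent * distance_m / focal_length_px


# Made-up values: the same envelope seen at two different distances.
near = estimate_size_m(pixel_extent=400, distance_m=0.3, focal_length_px=600)
far = estimate_size_m(pixel_extent=200, distance_m=0.6, focal_length_px=600)
```

Both calls give the same estimate, which is the property needed when mail is handed over from varying distances. The relation assumes the mail faces the camera squarely; a tilted envelope would appear smaller.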
The digit recognition kept failing on digit "9". In the future we might be able to fix it by adding more training samples to distinguish digits "1" and "9".
6)      Team

a.       Names and short bios of each member of your project
Mo Zhou – She is a graduate student in IEOR, under the supervision of Ken Goldberg. She has taken courses such as CS 189 and Stats 215A, and has been working on Machine Learning projects for research.
Jiacheng Wu – She is a computer science major and has taken courses such as CS 170 and EE 120.
Chunyu Hou – She is an aerospace engineering major and an exchange student from Harbin Institute of Technology.
Mingyi Zheng – He is a Mechanical Engineering major, has taken EE C128, and has experience working on Kinect object tracking.
7)      Additional Materials
a.       Code, URDFs, and launch files you wrote
b.      CAD models for any hardware you designed
Not Applicable.
c.       Datasheets for components used in your solution
Since we did not use any additional hardware, here we only attach the datasheets of the Baxter and its accessory kits (provided by Rethink).
Here are the links:

d.      Any additional videos, images, or data from your finished solution.

Video recording general mistakes: https://youtu.be/S8O9cBu0TIQ
Video showing the mistakes that occurred in different phases: https://youtu.be/WSiT7Dx1nU8
e.       We finished this project with the help of Professor Ruzena Bajcsy and all three GSIs (Victor Shia, Robert Mathew and Jaime Fernandez-Fisac), as well as the other Baxter groups, who carefully reconfigured the Baxter so that it could move more accurately. We would like to express our sincere gratitude to them.