3D Printed Soft Robotic Glove for Stroke Rehabilitation Equipped With Q-Learning for Sensing and Control.
According to the WHO, 15 million people suffer from a stroke worldwide each year.
Of these, 5 million die, and another 5 million are permanently disabled. This makes stroke the 2nd leading cause of death, responsible for approximately 11% of total deaths globally.
A stroke occurs when blood supply to a part of the brain is blocked or reduced. This can be due to a blockage in a blood vessel, or a ruptured blood vessel.
How a stroke affects you depends on its location, its severity, and how quickly you receive treatment.
Stroke patients often have a reduced ability to engage in activities of daily living such as grasping different objects due to motor impairment. The development of rehabilitative devices is needed to restore hand function in these patients.
Studies have shown that only 18% of patients with severe upper limb impairment attain full recovery of their motor functions. Although task-oriented rehabilitation exercises have been shown to maximize patient recovery outcomes, the high cost of rehabilitation prevents many stroke patients from accessing quality care.
There is a “golden period” of optimal recovery of hand function, which typically occurs within the first 3 months after a stroke. We want to improve hand motor function within this period through identical, repetitive movements of the impaired hand, because the brain reorganizes after a stroke (neuroplasticity) and can recover motor function.
Increasing functional independence in hand grasping can be approached in many ways through rehabilitation, through daily or periodic exercises, or by using assistive devices. Some of the most common approaches used today are:
Efforts by occupational therapists are often labour-intensive, costly and inefficient.
However, existing devices have disadvantages, including heavy weight and large size. Most of them cannot assist with activities of daily living, such as holding and picking up simple objects, which is a challenging task for stroke patients due to post-stroke spasticity.
End effector devices:
The end effector is external to the patient’s body and applies the required force to the end of the user’s extremity to help or resist the motion. It provides force without considering the individual joint motions of the patient’s limbs, which brings problems such as limited range of motion and dead-point issues. The end effector is also not portable, which limits its practical use in the clinic.
Hand exoskeleton rehabilitation:
Unlike the end effector, an exoskeleton device can be worn on the patient’s body. The joints and links of the robot correspond directly to the human joints and limbs, which makes them a better fit. This portability makes the exoskeleton a good choice for stroke rehabilitation, especially for patients in the later period of stroke, when they can train at home instead of always having to be in the clinic. For these reasons, I’ll be focusing on breaking down existing exoskeleton devices and their problems: for example, the robot axes have to be aligned with the anatomical axes of the hand, and these devices are still very heavy.
All of these methods aim to perform upper extremity assessment, for which the key outcome metrics are:
- Motor Function: assess gross motor movements and a series of general impairment measures when using the upper extremities.
- Global Stroke Severity: assess the severity of stroke through global assessment of deficits post stroke.
- Muscle Strength: assess muscle power and strength during movement and tasks.
- Dexterity: assess fine motor and manual skills through a variety of tasks, particularly with the use of the hand.
- Range of Motion: assess ability to freely move upper extremity at joints both passively and actively.
- Proprioception: assess bodily sensory awareness and location of limbs.
- Activities of Daily Living: assess performance and level of independence in various everyday tasks.
- Spasticity: assess the tone of muscles controlled by signals from the brain. If the part of your brain that sends these control signals is damaged by a stroke, then the muscle may become too active.
Existing Barriers With Exoskeleton Devices
Hand rehabilitation exoskeletons are in need of improving key features such as simplicity, compactness, bi-directional actuation, low cost, portability, safe human-robotic interaction, and intuitive control.
Because humans are involved, safety is the top priority in the design of rehabilitation robotic hands. For upper extremity exoskeletons, all mechanical properties have to be designed to resemble those of the human hand so that no serious harm is done to the human body.
Complexity, Availability and Cost:
Motor recovery training is long-term and costly for most patients. The main limitation to the practical use of hand rehabilitation robotics is availability, which is limited because these devices are expensive (material and development costs) and complex.
Most complex rehabilitation robotic devices today require supervision from specially equipped therapists. They are designed for in-clinic use and are generally not portable for this reason. Current hand rehabilitation robots also offer only 1 degree of freedom (DOF) per finger, even though fingers have far more, because any extra DOF would bring considerable extra expense.
Traditional hand exoskeleton devices are limited by their high cost, rigidity, and weight, and the constraints they place on the joints’ non-actuated degrees of freedom (DOFs) pose complications. These problems stem from their components: linear actuators and rigid linkages.
Large rehabilitation equipment is also very expensive, and providing enough room and equipment for stroke patients is too heavy a burden for hospitals.
- Powered lower and upper extremity exoskeletons sell for $70,000 to $120,000 each on average and can weigh upwards of 51 lbs.
For example, the ReWalk, the first exoskeleton approved by the FDA, costs between $69,000 and $85,000 and weighs 51 lbs.
There are some initiatives to develop cheaper exoskeletons: the Phoenix full-body exoskeleton for mobility assistance costs $40,000, and its maker sells industrial exoskeleton modules for $4,000 to $5,000 apiece.
However, most exoskeletons are still expensive, and researchers anticipate that a reduction from the current price of approximately $45,000 per exoskeleton is needed to make them more affordable. Many of the other barriers, such as intuitive control, safe human-robotic interaction, and device weight, also remain.
Soft robots can be used to solve many of the challenges faced with traditional exoskeletons. They have the advantages of higher flexibility, safer operation, lighter weight, and simplified production, resulting in lower manufacturing cost.
They are also capable of generating forces and torques to support bidirectional finger movements, consisting of finger flexion and extension. They control finger extension by the very finite elasticity of elastomeric actuators upon fluid depressurization.
Soft robotics has been applied to many problems, such as: a) Multi-gait soft robot crawling under an obstacle. b) Camouflage and display for soft machines. c) GoQBot, a caterpillar-inspired rolling robot capable of fast speeds. d) Pneumatic networks (PneuNets) for soft robots that rapidly actuate. e) Soft robotic glove for combined assistance and at-home rehabilitation. f) Hydraulic autonomous soft robotic fish for 3D swimming.
3D Printed Soft Robots
Early 3D printing technologies were limited to rigid materials, typically hard plastics. Today, additive manufacturing enables the rapid design and fabrication of soft robots, which can further reduce the manufacturing and development cost of hand exoskeletons.
Some of the key techniques that can be used for extrusion-based and powder-based 3D printing of soft actuators include:
- In fused deposition modelling (FDM) — a) and b): a thermoplastic filament is heated (ΔT) by an extrusion head and pushed through an extrusion nozzle to generate pneumatic actuators capable of lifting a 3.2kg chair.
- Direct ink writing (DIW) — c) and d): of composite hydrogel inks using pressure.
- Selective laser sintering (SLS) — e) and f): of thermoplastic polyurethane (TPU) powders to create a monolithic pneumatically actuated hand capable of safely interfacing with humans.
Drawbacks of Soft Robotics
Most soft robotic gloves today mainly provide flexional movement to grasp objects, but cannot produce sufficient extensional force for stroke patients who are unable to extend their fingers because of spasticity. This leads to a situation in which a stroke survivor is able to grasp an object but cannot release it.
Some fabric-based bidirectional soft robotic gloves have been designed to offer active assistance with both finger flexion and extension. However, the overall size of these bidirectional actuators ends up being much greater, making the system more inconvenient and uncomfortable to wear.
Soft Robotic Hand for Stroke Rehabilitation
We need a new soft robotic glove that is designed to assist both flexional and extensional movement while retaining a small size (15 mm in thickness or less) by incorporating an elastic torque-compensating layer into the soft actuator.
This design provides flexional torque on pressurization, while the compensating layer provides extensional torque on depressurization. It also exhibits significantly less nonlinearity than a purely soft actuator, which simplifies dynamic modeling of the actuator.
This soft robotic hand will have many benefits, including:
- more degrees of freedom and a larger range of motion.
- low component cost due to inexpensive materials (e.g. fabrics, elastomers, etc.).
- safe human-robotic interaction due to the soft and compliant materials used for their fabrication.
Control for Manipulation
Soft robots have limited output force, yet a soft robotic hand needs to supply a strong force when grasping various objects. The development of control algorithms remains a challenge for soft robots because of the nonlinearity of soft actuators and their interaction with the environment.
Existing controllers proposed for the manipulation of soft robots can be categorized into three types:
- Model-free controllers: usually based on machine learning techniques or empirical methods. Such controllers have great advantages in highly nonlinear, non-uniform, and unstructured environment situations where modeling is almost impossible.
- Model-based controllers: usually need analytical models to derive the controller. These controllers have more accurate and reliable performance than model-free controllers for uniform soft manipulators in known environments. However, they usually require well-defined dynamical models for the soft robots, which may not be easy to construct based on rigid-body assumptions.
- Hybrid controllers: combine model-free and model-based controllers, and are usually based on an analytical model to capture the main part of the system’s intrinsic properties and a data learning model to compensate dynamic uncertainties.
The architecture of the pneumatic control setup is as follows:
These soft-bodied agents can be controlled with reinforcement learning (RL), in which a machine learns to identify and execute the behavior that is optimal with respect to a reward (or loss) function. This approach is largely motivated by a paper from the University of Science and Technology of China.
The state of the soft robot can be represented by a set of predetermined points. The state of the entire robot is simply the combination of the states of its sections. Each section state is represented by three coordinates: the beginning, the center, and the end of the section.
The action is defined as the movement of one of the motion sections in discrete directions.
The reward function can be defined as a linear correlation function of the states.
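To make the state, action, and reward definitions above concrete, here is a minimal sketch in Python. The source does not give the exact representation, so several things here are assumptions: each coordinate is treated as a 3D point, the set of discrete directions (`DIRECTIONS`) is made up for illustration, and the "linear" reward is read as a weighted sum of the state vector with hypothetical weights.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # assumed (x, y, z), arbitrary units


@dataclass(frozen=True)
class SectionState:
    """One motion section, described by three tracked points."""
    beginning: Point
    center: Point
    end: Point

    def as_vector(self) -> List[float]:
        # Flatten the three points into 9 numbers.
        return [*self.beginning, *self.center, *self.end]


def robot_state(sections: Sequence[SectionState]) -> List[float]:
    """Full robot state: the section states concatenated in order."""
    vec: List[float] = []
    for section in sections:
        vec.extend(section.as_vector())
    return vec


# Hypothetical discrete directions; the real set depends on the actuator.
DIRECTIONS = ("+x", "-x", "+y", "-y")


def action_set(n_sections: int) -> List[Tuple[int, str]]:
    """An action moves one section in one discrete direction."""
    return list(product(range(n_sections), DIRECTIONS))


def linear_reward(state: Sequence[float], weights: Sequence[float],
                  bias: float = 0.0) -> float:
    """Reward as a weighted sum of the state (weights are an assumption)."""
    return sum(w * x for w, x in zip(weights, state)) + bias


# Example: a two-section finger with made-up coordinates.
finger = [
    SectionState((0, 0, 0), (0, 0, 1), (0, 0, 2)),
    SectionState((0, 0, 2), (0, 1, 3), (0, 2, 3)),
]
state = robot_state(finger)          # 18 numbers: 2 sections x 3 points x 3 coords
actions = action_set(len(finger))    # 8 actions: 2 sections x 4 directions
```

With this layout, adding a section just lengthens the state vector by nine numbers and adds one action per direction.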
I use Q-learning with function approximation to train the control policy. In each episode, the soft robot executes a sequence of actions starting from the initial state. On top of the state, action, and reward representation above, a “final action” is introduced. This action does not affect the shape of the robot but indicates that the episode should stop. The entire learning process therefore consists of repeated episodes, each running from the initial state to the “final action”.
Because the state space is continuous and the problem is nonlinear, I map the state space into a high-dimensional feature space. This structure can be seen as a neural network with only one hidden layer, and it is fit to the Q(s, a) function.
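The training scheme described above can be sketched roughly as follows. This is not the setup used in the simulation: the environment is a toy one-dimensional "finger position" that I made up, the fixed random `tanh` layer is just one possible way to map the state into a higher-dimensional space, and all constants (`STEP`, `TARGET`, learning rate, and so on) are assumptions. What it does show is the semi-gradient Q-learning update on linear weights and the role of the "final" action that ends an episode.

```python
import math
import random

ACTIONS = ("left", "right", "final")  # "final" ends the episode, shape unchanged
STEP, TARGET, N_FEATURES = 0.1, 0.7, 32  # toy constants (assumptions)

rng = random.Random(0)
# Fixed random hidden layer: maps the 1-D state into a 32-D feature space.
HIDDEN = [(rng.uniform(-4, 4), rng.uniform(-2, 2)) for _ in range(N_FEATURES)]


def features(x):
    return [math.tanh(w * x + b) for w, b in HIDDEN]


def q_value(weights, x, a):
    # Q(s, a) is linear in the features, with one weight vector per action.
    return sum(w * f for w, f in zip(weights[a], features(x)))


def step(x, a):
    """Toy dynamics: returns (next_state, reward, done)."""
    if a == "final":
        # Stopping is rewarded by how close the finger is to the target.
        return x, -abs(x - TARGET), True
    x = min(1.0, max(0.0, x + (STEP if a == "right" else -STEP)))
    return x, -0.01, False  # small cost per movement


def train(episodes=2000, alpha=0.02, gamma=0.95, epsilon=0.2, max_steps=30):
    weights = {a: [0.0] * N_FEATURES for a in ACTIONS}
    for _ in range(episodes):
        x = 0.0  # every episode restarts from the initial state
        for _ in range(max_steps):
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q_value(weights, x, act))
            nx, r, done = step(x, a)
            # Q-learning target: bootstrap from the best next action.
            best_next = 0.0 if done else max(q_value(weights, nx, b) for b in ACTIONS)
            td_error = r + gamma * best_next - q_value(weights, x, a)
            phi = features(x)
            weights[a] = [w + alpha * td_error * f for w, f in zip(weights[a], phi)]
            x = nx
            if done:
                break
    return weights
```

Calling `train()` returns one weight vector per action; the greedy policy is then the action with the largest `q_value` in the current state.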
The Q function is the mean value function associated with the step numbers:
I simulated the control of a soft robot gripper, using Q-learning with function approximation to train the control policy for a simple manipulation task. This was done with SOFA using the Soft Robotics Toolkit.
The simulator can also be used to sample the state space before any reinforcement learning. You can randomly execute actions and compute the Q value for each sample. From these samples, you can use gradient descent to compute the weights of this network, and then use the Q(s, a) function trained in this way as the initial Q in the simulation step to reach better performance.
For closed-loop control, at each time step you obtain the current state of the soft robot from the actual sensor. You then input this state into the Q function to get and execute the optimal action under the policy.
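As a rough illustration of this warm start, the sketch below samples random state and action pairs, labels each with a Q estimate, and fits the weights with stochastic gradient descent on the squared error. Everything concrete here is an assumption: `simulated_q` is a made-up stand-in for whatever Q estimate the simulator would provide, the state is a single number, and the feature map is a fixed random `tanh` layer.

```python
import math
import random

rng = random.Random(1)
ACTIONS = ("left", "right")
N_FEATURES = 16
# Fixed random hidden layer (an assumed feature map).
HIDDEN = [(rng.uniform(-4, 4), rng.uniform(-2, 2)) for _ in range(N_FEATURES)]


def features(x):
    return [math.tanh(w * x + b) for w, b in HIDDEN]


def simulated_q(x, a):
    """Hypothetical stand-in for a Q estimate computed in the simulator."""
    nx = x + (0.1 if a == "right" else -0.1)
    return -abs(nx - 0.7)


def sample_dataset(n=500):
    """Randomly sample states and actions, and label each with a Q estimate."""
    data = []
    for _ in range(n):
        x = rng.random()
        a = rng.choice(ACTIONS)
        data.append((x, a, simulated_q(x, a)))
    return data


def predict(weights, x, a):
    return sum(w * f for w, f in zip(weights[a], features(x)))


def mse(weights, data):
    return sum((predict(weights, x, a) - q) ** 2 for x, a, q in data) / len(data)


def pretrain(data, epochs=50, lr=0.02):
    """Fit the linear output weights by stochastic gradient descent."""
    weights = {a: [0.0] * N_FEATURES for a in ACTIONS}
    for _ in range(epochs):
        for x, a, q in data:
            phi = features(x)
            err = sum(w * f for w, f in zip(weights[a], phi)) - q
            # Gradient of 0.5 * err**2 with respect to the weights is err * phi.
            weights[a] = [w - lr * err * f for w, f in zip(weights[a], phi)]
    return weights
```

The weights returned by `pretrain` would then replace the all-zero initialization at the start of the reinforcement learning phase.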
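The loop itself is short. In the sketch below, `read_state`, `execute`, and `q_fn` are hypothetical callables standing in for the sensor driver, the pneumatic actuation command, and the trained Q function; the `ToyFinger` class and `toy_q` function exist only so the example runs on its own.

```python
def greedy_action(q_fn, state, actions):
    """Pick the action with the highest Q value in the current state."""
    return max(actions, key=lambda a: q_fn(state, a))


def control_loop(read_state, execute, q_fn, actions,
                 final_action="final", max_steps=100):
    """Sense, choose the greedy action, act; stop on the final action."""
    history = []
    for _ in range(max_steps):
        state = read_state()                      # current state from the sensor
        action = greedy_action(q_fn, state, actions)
        history.append(action)
        if action == final_action:                # policy says the task is done
            break
        execute(action)                           # send the command to the robot
    return history


class ToyFinger:
    """Made-up plant: an integer position that moves one step at a time."""

    def __init__(self):
        self.position = 0

    def read(self):
        return self.position

    def move(self, action):
        self.position += 1 if action == "right" else -1


TARGET = 7  # assumed goal position for the toy example


def toy_q(state, action):
    """Hand-written Q function that prefers moving toward TARGET."""
    if action == "right":
        return TARGET - state
    if action == "left":
        return state - TARGET
    return 0.5 - abs(TARGET - state)  # "final" is best only at the target


finger = ToyFinger()
history = control_loop(finger.read, finger.move, toy_q, ("left", "right", "final"))
# history is seven "right" moves followed by "final"
```

The `max_steps` cap is a safety choice on my part: on real hardware you would not want the loop to keep running if the policy never selects the final action.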
A soft robot may interact with a wide variety of objects in the environment, where some of the information of these objects and the robot itself cannot be effectively perceived. For example, hardness or weight of an object that the robot tries to pick up can potentially interfere with the interaction.
Sensor characterization can help with this as it involves collecting raw sensor data aligned with ground truth in a controlled and monitored environment (e.g., lab setting). This data is then used as input to a neural network to predict further values such as force (N), pressure (Pa), strain (%) and location (mm).
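As a much simpler stand-in for the neural network mentioned above, the sketch below fits a straight-line calibration from raw sensor readings to ground-truth force using ordinary least squares. The data values are made up, and a real soft sensor is usually nonlinear, which is exactly why a learned model would be used in practice; the point here is only the workflow of pairing raw readings with ground truth and then predicting from new readings.

```python
from typing import Sequence, Tuple


def fit_linear_calibration(raw: Sequence[float],
                           truth: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of truth ~ slope * raw + intercept."""
    n = len(raw)
    if n != len(truth) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(raw) / n
    mean_y = sum(truth) / n
    var_x = sum((x - mean_x) ** 2 for x in raw)
    if var_x == 0:
        raise ValueError("raw readings must not all be identical")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, truth))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def predict_force(raw_value: float, slope: float, intercept: float) -> float:
    """Map a new raw reading to an estimated force in newtons."""
    return slope * raw_value + intercept


# Made-up lab data: raw sensor units paired with force measured by a load cell.
raw_readings = [0.0, 1.0, 2.0, 3.0]
measured_force = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_linear_calibration(raw_readings, measured_force)
```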
In contrast, systems characterization collects sensor data in a less controlled environment that mimics use in the field or in real time. In this case, ground truth measurements such as force are more difficult to obtain. Therefore, users often circumvent this by mapping to higher-level classifications, such as grasp success and texture recognition.
3D printed soft robotics and Q-learning approaches for control will allow us to develop hand exoskeletons that are lower in cost, more portable, and more intuitive to control than existing exoskeletons.
Further work in this space includes developing techniques for individual finger control, as well as more solid theoretical guidance and analysis of the value function and of policy migration between simulation and reality.