ReconCycle Context Action Framework

This package provides a collection of common class definitions that feature extractors (e.g., in the vision pipeline) and action predictors should use.

Introduction

The system is split into the following components:

  • Controller: requests and carries out actions
  • Context: the state of the system (the active work-cell module, object positions, and the state of the robots and modules)
  • Action Predictor: provides the get next action service
  • Action Blocks:
    • Cut Block
    • Lever Block
    • Move Block
    • Push Block
    • Turn Over Block
    • Vice Block
    • Vision Block

The system is built around action blocks, which are high-level components: for example, a move block specifies the start and end positions of a pick-and-place operation, and a vision block obtains object positions from camera images.

The system has a controller that requests the next action to take from the get next action service a.k.a. action predictor. The action predictor doesn't know anything about the controller. The controller is responsible for carrying out the actions.

Diagram

The action predictor stores the previous predicted actions and whether they were successful. It uses this in a prediction model to determine which action should be performed next given the current context. The action predictor returns the next action to the controller. The system is illustrated in the above diagram.
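As a rough sketch, the controller's request can be thought of as a single ROS service call. The service name /get_next_action and the GetNextAction service type below are assumptions for illustration, not the actual interface:

```python
import rospy

# Hypothetical service name and type: the actual .srv definition is provided
# by the action predictor package, not by this sketch.
from context_action_framework.srv import GetNextAction

rospy.init_node("controller")
rospy.wait_for_service("/get_next_action")
get_next_action = rospy.ServiceProxy("/get_next_action", GetNextAction)

# Ask the action predictor which action block to execute next.
response = get_next_action()
rospy.loginfo("next action: %s", response.action)
```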

Programming guideline

This package also serves as a library that provides a collection of common class definitions to be used across the entire project.

Controller

The controller requests the next action to take from the action predictor. The action predictor returns an action block. The controller carries out the action as described by the action block and returns the context to the action predictor in the form of action details.

The controller is implemented as a FlexBE behaviour.


FlexBE

Several FlexBE states should be created:

  • Get context should return the current context and pass it to the next state
  • Get next action (read recommended action) is a wrapper around the action prediction model, and should return the next action given the context

For each of the actions, a separate FlexBE state that calls the appropriate ActionBlock is needed.
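As a sketch only, the get next action state could look like the following FlexBE EventState. The service name, service type, and outcome names are assumptions and would need to match the actual action predictor interface:

```python
from flexbe_core import EventState, Logger
from flexbe_core.proxy import ProxyServiceCaller

# Hypothetical service type; the real .srv is defined by the action predictor.
from context_action_framework.srv import GetNextAction, GetNextActionRequest


class GetNextActionState(EventState):
    """Sketch of a FlexBE state that wraps the get next action service."""

    def __init__(self, service_topic="/get_next_action"):
        # One outcome per action block, plus a failure outcome.
        super(GetNextActionState, self).__init__(
            outcomes=["cut", "lever", "move", "push", "turn_over", "vice", "vision", "failed"],
            input_keys=["context"],
            output_keys=["action_block"])
        self._topic = service_topic
        self._srv = ProxyServiceCaller({self._topic: GetNextAction})

    def execute(self, userdata):
        try:
            result = self._srv.call(self._topic, GetNextActionRequest())
        except Exception as e:
            Logger.logwarn("get next action call failed: %s" % str(e))
            return "failed"
        userdata.action_block = result.action_block
        # The returned action name is assumed to map onto one of the outcomes above.
        return result.action_name
```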

Action Predictor

The action predictor is described in detail in its own readme. The action predictor provides the get next action service and returns an action block.

Context

The context is defined as the state of the system including the work-cell module that is being operated in, the positions of objects in the system, and the state of the robots and the modules.

These are all specified as enums in types.py.

The modules are:

  • vision
  • panda1
  • panda2
  • vice
  • cutter

The robots are:

  • panda1
  • panda2

The End effectors are:

  • soft hand
  • soft gripper
  • screwdriver

The Cameras are:

  • basler
  • realsense

The Labels of objects are:

  • hca
  • hca_empty
  • smoke_detector
  • smoke_detector_insides
  • smoke_detector_insides_empty
  • battery
  • pcb
  • internals
  • pcb_covered
  • plastic_clip
  • wires
  • screw
  • battery_covered
  • gap

The faces of the HCAs and smoke detectors are:

  • front
  • back
  • side 1
  • side 2

The actions are:

  • none
  • start
  • end
  • cut
  • lever
  • move
  • push
  • turn over
  • vision
  • vice

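A minimal sketch of how these enums might be used; Label and LabelFace appear later in this readme, while the other enum names (Module, Robot, EndEffector, Camera) are assumptions to be checked against types.py:

```python
# Enum names other than Label and LabelFace are assumptions based on the lists above.
from context_action_framework.types import Label, LabelFace, Module, Robot, EndEffector, Camera

label = Label.smoke_detector
face = LabelFace.front
robot = Robot.panda1
camera = Camera.basler

# Enums make the context explicit and comparable, instead of relying on raw strings.
if label == Label.smoke_detector and face == LabelFace.back:
    print("smoke detector is lying face down")
```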

Device Names

Diagram

HCA names:

  1. kalo2
  2. minol
  3. kalo
  4. techem
  5. ecotron
  6. heimer
  7. caloric
  8. exim
  9. ista
  10. qundis
  11. enco
  12. kundo
  13. qundis2

Smoke detector names:

  1. senys
  2. fumonic
  3. siemens
  4. hekatron
  5. kalo
  6. fireangel
  7. siemens2
  8. zettler
  9. honeywell
  10. esser

The context action framework provides a function lookup_label_precise_name to get the device name given the device number. See types.py.
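For example (the exact signature is an assumption; check types.py):

```python
from context_action_framework.types import Label, lookup_label_precise_name

# Assumed usage: map a label and its precise device number (e.g. "02") to a name.
name = lookup_label_precise_name(Label.smoke_detector, "02")
print(name)  # expected to be "fumonic" according to the list above
```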

ROS Detection Message Format

The context action framework provides the Detection class, defined in types.py. Two helper functions, detections_to_ros and detections_to_py, convert a Python Detection object to a ROS Detection.msg and a ROS message back to a Python object.
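A minimal sketch of the round trip between the two representations:

```python
from context_action_framework.types import detections_to_ros, detections_to_py

# In practice this list comes from the vision pipeline; empty here for illustration.
detections = []

ros_msgs = detections_to_ros(detections)        # Python Detection objects -> ROS messages
detections_again = detections_to_py(ros_msgs)   # ROS messages -> Python Detection objects
```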

Each detection has the following attributes:

  • id (int): index in detections list

  • tracking_id (int): unique ID per label that is stable across frames.

  • label (Label): hca/smoke_detector/battery...

  • label_face (LabelFace/None): front/back/side1/side2

  • label_precise (str/None): 01/01.1/02...

  • label_precise_name (str/None): kalo/minol/fumonic/siemens/...

  • score (float): segmentation confidence

  • tf_px (Transform): transform in pixels

  • box_px (array 4x2): bounding box in pixels

  • obb_px (array 4x2): oriented bounding box in pixels

  • center_px (array 2): center coordinates in pixels

  • polygon_px (Polygon nx2): polygon segmentation in pixels

  • tf (Transform): transform in meters

  • box (array 4x3): bounding box in meters

  • obb (array 4x3): oriented bounding box in meters

  • center (array 3): center coordinates in meters

  • polygon (Polygon nx3): polygon segmentation in meters

  • obb_3d (array 8x3): oriented bounding box with depth in meters

  • parent_frame (str): ROS parent frame name

  • table_name (str/None): table name of detection location

  • tf_name (str): ROS transform name corresponding to published detection TF
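As an example of working with these attributes, the sketch below selects battery detections and reads their positions in meters (field names are taken from the list above):

```python
from context_action_framework.types import Label

def battery_positions(detections):
    """Return (tf_name, center in meters) for every battery detection.

    `detections` is a list of Detection objects, e.g. produced by the vision pipeline.
    """
    return [(d.tf_name, d.center) for d in detections if d.label == Label.battery]
```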

Action Blocks

An action block is a high-level specification of an operation that can be carried out on the ReconCycle cells. An action block can be a physical movement, an operation that extracts information from the physical environment, or a combination of the two.

A single action block can consist of multiple actions; for example, the cut block moves an object into the cutter and then activates the cutter to cut the object.

The cut block should specify the initial position of the object and the cutter module where the object is to be cut.

CutBlock.msg:

  • enum from_module
  • Transform from_tf
  • enum to_module
  • Transform to_tf
  • array obb_3d
  • enum robot
  • int end_effector

CutDetails.msg:

  • bool success

The lever block should specify from where to where to carry out the levering action and with which end effector and robot.

LeverBlock.msg:

  • enum module
  • Transform from_tf
  • Transform to_tf
  • array obb_3d
  • enum robot
  • enum end_effector

LeverDetails.msg:

  • bool success

The move block specifies the start and end positions of an object and which end effector and robot should do the moving.

MoveBlock.msg:

  • enum from_module
  • Transform from_tf
  • enum to_module
  • Transform to_tf
  • array obb_3d
  • enum robot
  • enum end_effector

MoveDetails.msg:

  • bool success
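A sketch of filling a MoveBlock message in Python; the message import path, the enum names, and sending enums as their integer values are assumptions based on the field list above:

```python
from geometry_msgs.msg import Transform
from context_action_framework.msg import MoveBlock                      # assumed import path
from context_action_framework.types import Module, Robot, EndEffector  # assumed enum names

block = MoveBlock()
block.from_module = Module.vision.value   # enums sent as their integer values (assumption)
block.to_module = Module.vice.value
block.from_tf = Transform()               # pick pose, e.g. taken from a Detection tf
block.to_tf = Transform()                 # place pose on the target module
block.robot = Robot.panda1.value
block.end_effector = EndEffector.soft_gripper.value
```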

The push block specifies the start and end positions of the pushing action and with which robot and end effector the push should be carried out.

PushBlock.msg:

  • enum module
  • Transform from_tf
  • Transform to_tf
  • array obb_3d
  • enum robot
  • enum end_effector

PushDetails.msg:

  • bool success

The turn over block specifies the position and 3D oriented bounding box of the object that should be picked up, rotated 180 degrees, and placed down again with the specified robot and end effector.

TurnOverBlock.msg:

  • enum module
  • Transform tf
  • array obb_3d
  • enum robot
  • enum end_effector

TurnOverDetails.msg:

  • bool success

The vice block specifies whether the vice should clamp and turn over, only clamp, or only turn over.

ViceBlock.msg:

  • enum module
  • bool clamp
  • bool turn_over

ViceDetails.msg:

  • bool success

The vision block specifies whether gap detection should be carried out, which camera to use, and above which module. Gap detection is only possible with the realsense camera, and only the realsense camera can be moved to a specified position.

The gap detection is useful for levering actions. The parts detection is useful for moving actions. All coordinates of parts are given in world coordinates with respect to the module.

The parts detection uses a neural network called Yolact for parts segmentation. It uses a Kalman filter for tracking and re-identification.

The gap detection uses the depth image and a classical clustering approach to determine gaps in the device.
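The exact method is internal to the vision pipeline; purely as an illustration, a classical approach could threshold the depth image against the device surface and cluster the deeper pixels, e.g. with DBSCAN:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def find_gap_clusters(depth, surface_depth, min_gap=0.005):
    """Illustration only: cluster pixels that lie deeper than the device surface.

    depth: HxW depth image in meters; surface_depth: estimated surface depth in meters.
    """
    ys, xs = np.where(depth > surface_depth + min_gap)  # pixels deeper than the surface
    points = np.column_stack([xs, ys])
    if len(points) == 0:
        return []
    labels = DBSCAN(eps=3, min_samples=20).fit_predict(points)
    # Each cluster of deep pixels is a gap candidate; -1 marks noise.
    return [points[labels == k] for k in set(labels) if k != -1]
```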

The vision details are a list of detections and gaps (if gap detection was requested and available).

VisionBlock.msg:

  • enum camera
  • enum module
  • Transform tf
  • bool gap_detection

VisionDetails.msg:

  • bool gap_detection
  • Detection[] detections
  • Gap[] gaps
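A sketch of how the controller might consume a VisionDetails result (field access follows the listing above):

```python
def summarise_vision(details):
    """Print a short summary of a VisionDetails message (sketch)."""
    print("gap detection ran:", details.gap_detection)
    for det in details.detections:
        print("detection", det.id, "label", det.label, "score", det.score)
    for gap in details.gaps:
        print("gap", gap.id, "depth", gap.from_depth, "->", gap.to_depth)
```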

A detection is defined as the whole or part of a device.

Detection:

  • int id
  • int tracking_id
  • enum label
  • float score
  • Transform tf_px
  • array box_px
  • array obb_px
  • array obb_3d_px
  • Transform tf
  • array box
  • array obb
  • array obb_3d
  • array polygon_px

Gap:

  • int id
  • Transform from_tf
  • Transform to_tf
  • float from_depth
  • float to_depth
  • array obb
  • array obb_3d

Camera Position

The camera position needs to be known so that we can transform from image coordinates to world coordinates relative to the module we are looking at.

The camera position can be fixed or mounted to the robot hand just above the end-effector.

When the camera is fixed, the world coordinates are determined by the position of the work surface in the image.

When the camera is mounted to the robot, the extrinsic position of the camera is determined by the robot transform and the hand-eye transform. The position of the object is then calculated based on camera intrinsics and distance of object from camera. Without the depth it is not possible to determine the position of an object when the object dimensions are unknown.
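The back-projection described here is the standard pinhole-camera calculation; as a sketch (variable names are illustrative):

```python
import numpy as np

def pixel_to_camera_frame(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with known depth (meters) into the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# The world-frame position is then obtained by chaining the robot transform and the
# hand-eye transform: p_world = T_world_hand @ T_hand_camera @ p_camera (homogeneous).
```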

More Tactile Skills to come...

Currently, we only have vision. In the future, this will be extended to tactile skills as well.


Issues

Consider adding hierarchy to labels

Currently, we have labels such as hca_front and hca_back (see https://github.com/ReconCycle/context_action_framework/blob/994624bc639e2e290ac925626f99e7dca04c9ed7/src/context_action_framework/types.py#L30C22-L30C22). For further processing, it would be beneficial if a class hierarchy were added. We could borrow concepts from OOP, e.g. HCAFront could inherit from HCA.

Distinguishing between different makes of electronic devices of the same family could be addressed in the same fashion (e.g. Hekatron could inherit from smoke_detector).
