ReconCycle Context Action Framework

This package provides a collection of common class definitions that feature extractors (e.g., in the vision pipeline) and action predictors should use.

Introduction

The system is split into the following components:

  • Controller: requests and carries out actions
  • Context: the state of the system (the active work-cell module, object positions, and the state of the robots and modules)
  • Action Predictor: provides the get next action service
  • Action Blocks:
    • Cut Block
    • Lever Block
    • Move Block
    • Push Block
    • Turn Over Block
    • Vice Block
    • Vision Block

The system is built around action blocks, which are high-level components: for example, a move block specifies the start and end positions of a pick-and-place operation, and a vision block obtains object positions from camera images.

The system has a controller that requests the next action to take from the get next action service a.k.a. action predictor. The action predictor doesn't know anything about the controller. The controller is responsible for carrying out the actions.

Diagram

The action predictor stores the previous predicted actions and whether they were successful. It uses this in a prediction model to determine which action should be performed next given the current context. The action predictor returns the next action to the controller. The system is illustrated in the above diagram.
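As a rough sketch, the controller's request can be thought of as a single ROS service call. The service name /get_next_action and the GetNextAction service type below are assumptions for illustration, not the actual interface:

```python
import rospy

# Hypothetical service name and type: the actual .srv definition is provided
# by the action predictor package, not by this sketch.
from context_action_framework.srv import GetNextAction

rospy.init_node("controller")
rospy.wait_for_service("/get_next_action")
get_next_action = rospy.ServiceProxy("/get_next_action", GetNextAction)

# Ask the action predictor which action block to execute next.
response = get_next_action()
rospy.loginfo("next action: %s", response.action)
```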

Programming guideline

This package also serves as a library that provides a collection of common class definitions to be used across the entire project.

Controller

The controller requests the next action to take from the action predictor. The action predictor returns an action block. The controller carries out the action as described by the action block and returns the context to the action predictor in the form of action details.

The controller is implemented as a FlexBE behaviour.


FlexBE

Several FlexBE states should be created:

  • Get context should return the current context and pass it to the next state
  • Get next action (read recommended action) is a wrapper around the action prediction model, and should return the next action given the context

For each of the actions, a separate FlexBE state that calls the appropriate ActionBlock is needed.
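As a sketch only, the get next action state could look like the following FlexBE EventState. The service name, service type, and outcome names are assumptions and would need to match the actual action predictor interface:

```python
from flexbe_core import EventState, Logger
from flexbe_core.proxy import ProxyServiceCaller

# Hypothetical service type; the real .srv is defined by the action predictor.
from context_action_framework.srv import GetNextAction, GetNextActionRequest


class GetNextActionState(EventState):
    """Sketch of a FlexBE state that wraps the get next action service."""

    def __init__(self, service_topic="/get_next_action"):
        # One outcome per action block, plus a failure outcome.
        super(GetNextActionState, self).__init__(
            outcomes=["cut", "lever", "move", "push", "turn_over", "vice", "vision", "failed"],
            input_keys=["context"],
            output_keys=["action_block"])
        self._topic = service_topic
        self._srv = ProxyServiceCaller({self._topic: GetNextAction})

    def execute(self, userdata):
        try:
            result = self._srv.call(self._topic, GetNextActionRequest())
        except Exception as e:
            Logger.logwarn("get next action call failed: %s" % str(e))
            return "failed"
        userdata.action_block = result.action_block
        # The returned action name is assumed to map onto one of the outcomes above.
        return result.action_name
```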

Action Predictor

The action predictor is described in detail in its own readme. The action predictor provides the get next action service and returns an action block.

Context

The context is defined as the state of the system including the work-cell module that is being operated in, the positions of objects in the system, and the state of the robots and the modules.

These are all specified as enums in types.py.

The modules are:

  • vision
  • panda1
  • panda2
  • vice
  • cutter

The robots are:

  • panda1
  • panda2

The End effectors are:

  • soft hand
  • soft gripper
  • screwdriver

The Cameras are:

  • basler
  • realsense

The Labels of objects are:

  • hca
  • hca_empty
  • smoke_detector
  • smoke_detector_insides
  • smoke_detector_insides_empty
  • battery
  • pcb
  • internals
  • pcb_covered
  • plastic_clip
  • wires
  • screw
  • battery_covered
  • gap

The faces of the HCAs and smoke detectors are:

  • front
  • back
  • side 1
  • side 2

The actions are:

  • none
  • start
  • end
  • cut
  • lever
  • move
  • push
  • turn over
  • vision
  • vice

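A minimal sketch of how these enums might be used; Label and LabelFace appear later in this readme, while the other enum names (Module, Robot, EndEffector, Camera) are assumptions to be checked against types.py:

```python
# Enum names other than Label and LabelFace are assumptions based on the lists above.
from context_action_framework.types import Label, LabelFace, Module, Robot, EndEffector, Camera

label = Label.smoke_detector
face = LabelFace.front
robot = Robot.panda1
camera = Camera.basler

# Enums make the context explicit and comparable, instead of relying on raw strings.
if label == Label.smoke_detector and face == LabelFace.back:
    print("smoke detector is lying face down")
```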

Device Names

Diagram

HCA names:

  1. kalo2
  2. minol
  3. kalo
  4. techem
  5. ecotron
  6. heimer
  7. caloric
  8. exim
  9. ista
  10. qundis
  11. enco
  12. kundo
  13. qundis2

Smoke detector names:

  1. senys
  2. fumonic
  3. siemens
  4. hekatron
  5. kalo
  6. fireangel
  7. siemens2
  8. zettler
  9. honeywell
  10. esser

The context action framework provides a function lookup_label_precise_name to get the device name given the device number. See types.py.
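For example (the exact signature is an assumption; check types.py):

```python
from context_action_framework.types import Label, lookup_label_precise_name

# Assumed usage: map a label and its precise device number (e.g. "02") to a name.
name = lookup_label_precise_name(Label.smoke_detector, "02")
print(name)  # expected to be "fumonic" according to the list above
```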

ROS Detection Message Format

The context action framework provides the Detection class, defined in types.py. Two helper functions, detections_to_ros and detections_to_py, convert a Python Detection object to a ROS Detection.msg and a ROS message back to a Python object.
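A minimal sketch of the round trip between the two representations:

```python
from context_action_framework.types import detections_to_ros, detections_to_py

# In practice this list comes from the vision pipeline; empty here for illustration.
detections = []

ros_msgs = detections_to_ros(detections)        # Python Detection objects -> ROS messages
detections_again = detections_to_py(ros_msgs)   # ROS messages -> Python Detection objects
```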

Each detection has the following attributes:

  • id (int): index in detections list

  • tracking_id (int): unique ID per label that is stable across frames.

  • label (Label): hca/smoke_detector/battery...

  • label_face (LabelFace/None): front/back/side1/side2

  • label_precise (str/None): 01/01.1/02...

  • label_precise_name (str/None): kalo/minol/fumonic/siemens/...

  • score (float): segmentation confidence

  • tf_px (Transform): transform in pixels

  • box_px (array 4x2): bounding box in pixels

  • obb_px (array 4x2): oriented bounding box in pixels

  • center_px (array 2): center coordinates in pixels

  • polygon_px (Polygon nx2): polygon segmentation in pixels

  • tf (Transform): transform in meters

  • box (array 4x3): bounding box in meters

  • obb (array 4x3): oriented bounding box in meters

  • center (array 3): center coordinates in meters

  • polygon (Polygon nx3): polygon segmentation in meters

  • obb_3d (array 8x3): oriented bounding box with depth in meters

  • parent_frame (str): ROS parent frame name

  • table_name (str/None): table name of detection location

  • tf_name (str): ROS transform name corresponding to published detection TF
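As an example of working with these attributes, the sketch below selects battery detections and reads their positions in meters (field names are taken from the list above):

```python
from context_action_framework.types import Label

def battery_positions(detections):
    """Return (tf_name, center in meters) for every battery detection.

    `detections` is a list of Detection objects, e.g. produced by the vision pipeline.
    """
    return [(d.tf_name, d.center) for d in detections if d.label == Label.battery]
```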

Action Blocks

An action block is a high-level specification of an operation that can be carried out on the ReconCycle cells. An action block can be a physical movement, an operation that extracts information from the physical environment, or a combination of the two.

A single action block can consist of multiple actions; for example, the cut block moves an object into the cutter and then activates the cutter to cut the object.

The cut block should specify the initial position of the object and the cutter module where the object is to be cut.

CutBlock.msg:

  • enum from_module
  • Transform from_tf
  • enum to_module
  • Transform to_tf
  • array obb_3d
  • enum robot
  • int end_effector

CutDetails.msg:

  • bool success

The lever block should specify from where to where to carry out the levering action and with which end effector and robot.

LeverBlock.msg:

  • enum module
  • Transform from_tf
  • Transform to_tf
  • array obb_3d
  • enum robot
  • enum end_effector

LeverDetails.msg:

  • bool success

The move block specifies the start and end positions of an object and which end effector and robot should do the moving.

MoveBlock.msg:

  • enum from_module
  • Transform from_tf
  • enum to_module
  • Transform to_tf
  • array obb_3d
  • enum robot
  • enum end_effector

MoveDetails.msg:

  • bool success
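A sketch of filling a MoveBlock message in Python; the message import path, the enum names, and sending enums as their integer values are assumptions based on the field list above:

```python
from geometry_msgs.msg import Transform
from context_action_framework.msg import MoveBlock                      # assumed import path
from context_action_framework.types import Module, Robot, EndEffector  # assumed enum names

block = MoveBlock()
block.from_module = Module.vision.value   # enums sent as their integer values (assumption)
block.to_module = Module.vice.value
block.from_tf = Transform()               # pick pose, e.g. taken from a Detection tf
block.to_tf = Transform()                 # place pose on the target module
block.robot = Robot.panda1.value
block.end_effector = EndEffector.soft_gripper.value
```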

The push block specifies the start and end positions of the pushing action and with which robot and end effector the push should be carried out.

PushBlock.msg:

  • enum module
  • Transform from_tf
  • Transform to_tf
  • array obb_3d
  • enum robot
  • enum end_effector

PushDetails.msg:

  • bool success

The turn over block specifies the position and 3D oriented bounding box of the object that should be picked up, rotated 180 degrees, and placed down again with the specified robot and end effector.

TurnOverBlock.msg:

  • enum module
  • Transform tf
  • array obb_3d
  • enum robot
  • enum end_effector

TurnOverDetails.msg:

  • bool success

The vice block specifies whether the vice should clamp and turn over, only clamp, or only turn over.

ViceBlock.msg:

  • enum module
  • bool clamp
  • bool turn_over

ViceDetails.msg:

  • bool success

The vision block specifies whether gap detection should be carried out, which camera to use, and above which module. Gap detection is only possible with the realsense camera, and only the realsense camera can be moved to a specified position.

The gap detection is useful for levering actions. The parts detection is useful for moving actions. All coordinates of parts are given in world coordinates with respect to the module.

The parts detection uses a neural network called Yolact for parts segmentation. It uses a Kalman filter for tracking and re-identification.

The gap detection uses the depth image and a classical clustering approach to determine gaps in the device.
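The exact method is internal to the vision pipeline; purely as an illustration, a classical approach could threshold the depth image against the device surface and cluster the deeper pixels, e.g. with DBSCAN:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def find_gap_clusters(depth, surface_depth, min_gap=0.005):
    """Illustration only: cluster pixels that lie deeper than the device surface.

    depth: HxW depth image in meters; surface_depth: estimated surface depth in meters.
    """
    ys, xs = np.where(depth > surface_depth + min_gap)  # pixels deeper than the surface
    points = np.column_stack([xs, ys])
    if len(points) == 0:
        return []
    labels = DBSCAN(eps=3, min_samples=20).fit_predict(points)
    # Each cluster of deep pixels is a gap candidate; -1 marks noise.
    return [points[labels == k] for k in set(labels) if k != -1]
```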

The vision details are a list of detections and gaps (if gap detection was requested and available).

VisionBlock.msg:

  • enum camera
  • enum module
  • Transform tf
  • bool gap_detection

VisionDetails.msg:

  • bool gap_detection
  • Detection[] detections
  • Gap[] gaps
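A sketch of how the controller might consume a VisionDetails result (field access follows the listing above):

```python
def summarise_vision(details):
    """Print a short summary of a VisionDetails message (sketch)."""
    print("gap detection ran:", details.gap_detection)
    for det in details.detections:
        print("detection", det.id, "label", det.label, "score", det.score)
    for gap in details.gaps:
        print("gap", gap.id, "depth", gap.from_depth, "->", gap.to_depth)
```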

A detection is defined as the whole or part of a device.

Detection:

  • int id
  • int tracking_id
  • enum label
  • float score
  • Transform tf_px
  • array box_px
  • array obb_px
  • array obb_3d_px
  • Transform tf
  • array box
  • array obb
  • array obb_3d
  • array polygon_px

Gap:

  • int id
  • Transform from_tf
  • Transform to_tf
  • float from_depth
  • float to_depth
  • array obb
  • array obb_3d

Camera Position

The camera position needs to be known so that we can transform from image coordinates to world coordinates relative to the module we are looking at.

The camera position can be fixed or mounted to the robot hand just above the end-effector.

When the camera is fixed, the world coordinates are determined by the position of the work surface in the image.

When the camera is mounted to the robot, the extrinsic position of the camera is determined by the robot transform and the hand-eye transform. The position of the object is then calculated based on camera intrinsics and distance of object from camera. Without the depth it is not possible to determine the position of an object when the object dimensions are unknown.
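The back-projection described here is the standard pinhole-camera calculation; as a sketch (variable names are illustrative):

```python
import numpy as np

def pixel_to_camera_frame(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with known depth (meters) into the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# The world-frame position is then obtained by chaining the robot transform and the
# hand-eye transform: p_world = T_world_hand @ T_hand_camera @ p_camera (homogeneous).
```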

More Tactile Skills to come...

Currently, we only have vision. In the future, this will be extended to tactile skills as well.


Issues

Consider adding hierarchy to labels

Currently, we have labels such as hca_front and hca_back (see https://github.com/ReconCycle/context_action_framework/blob/994624bc639e2e290ac925626f99e7dca04c9ed7/src/context_action_framework/types.py#L30C22-L30C22). For further processing, it would be beneficial if a class hierarchy were added. We could borrow concepts from OOP, e.g. HCAFront could inherit from HCA.

Distinguishing between different makes of electronic devices of the same family could be addressed in the same fashion (e.g. Hekatron could inherit from smoke_detector).
