
emip-toolkit's Introduction

Hi, I'm Naser Al Madi

Assistant Professor of Computer Science at Colby College

Visiting Scholar at Harvard


Connect with me: nasermadi, @prof_naser


emip-toolkit's Issues

Add new dataset - Eye Tracking Analysis of Code Layout, Crowding and Dyslexia - An Open Data Set

Add a parser (or use an existing one if possible) for reading data from the following dataset: https://dl.acm.org/doi/fullHtml/10.1145/3448018.3457420

Requirements:

1- Create a Jupyter Notebook to show that all functions and methods work with the new dataset.
2- Add the dataset and a reference to the dataset dictionary in the code (a sketch follows this list).
3- Make sure the dataset can be downloaded automatically and unzipped using the existing methods.
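As a rough illustration of items 2 and 3 (all names, keys, and URLs below are placeholders, not the toolkit's actual API):

import io
import os
import urllib.request
import zipfile

# Hypothetical dataset dictionary; URLs are placeholders.
datasets = {
    "EMIP": "http://example.org/emip_dataset.zip",
    "Dyslexia2021": "http://example.org/dyslexia_dataset.zip",  # new entry
}

def download(dataset_name, out_dir="datasets"):
    """Download a registered dataset and unzip it into its own folder."""
    url = datasets[dataset_name]
    with urllib.request.urlopen(url) as response:
        archive = zipfile.ZipFile(io.BytesIO(response.read()))
    archive.extractall(os.path.join(out_dir, dataset_name))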

Samples variable holds fixations, saccades, and blinks instead of raw samples in Al Madi 2018 dataset

In the read_EyeLink1000 function, fixations, saccades, and blinks were parsed into the samples variable. An example can be seen below:

if token[0] == "EFIX":
    timestamp = int(token[2])
    duration = int(token[4])
    x_cord = float(token[5])
    y_cord = float(token[6])
    pupil = int(token[7])

    fixations[count] = Fixation(trial_id=trial_id,
                                participant_id=participant_id,
                                timestamp=timestamp,
                                duration=duration,
                                x_cord=x_cord,
                                y_cord=y_cord,
                                token="",
                                pupil=pupil)

    samples.append('EFIX' + ' '.join(token))

The same token that is used to populate the fields of the Fixation object is also appended to samples. If the Al Madi 2018 dataset does not contain raw samples, the samples variable should be left empty to avoid confusion.

error in emtk/util/_get_stimuli.py

The dimensions used for pasting the stimuli onto the background are incorrect.

Instead of (100, 375), they should be (0, 375) so that users can see the correct positions of fixations on the text.

Name Issue for Parser

We wanted the names of two parsers to be more specific, so we decided to change them from read_FileType to read_EyeTrackerName and add the file type as a parameter.
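For example (a sketch only, with "filetype" as an assumed parameter name; the final signature may differ):

def read_EyeLink1000(filename, filetype="asc"):
    """Read a recording produced by an EyeLink 1000 eye tracker."""
    ...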

Complete community profile

EMTK doesn't have a community profile yet, so it is not clear how people can contribute to the open-source project. To help build a community around the tool, we need a few well-written documents. You can contribute these documents by looking up tutorials and checking the repositories of popular open-source projects. The documents we need are:

1- Code of conduct
2- Contributing
3- License
4- Issue templates
5- Pull request template

not using the variable "sample_duration"

At line 69 of idt_classifier.py, it should be

[timestamp, len(window_x) * sample_duration, statistics.mean(window_x), statistics.mean(window_y)])

not

[timestamp, len(window_x) * 4, statistics.mean(window_x), statistics.mean(window_y)])

The hard-coded 4 presumably stands for the sample duration in milliseconds, which is exactly what the unused sample_duration variable holds.

Enhancement - Save image background size as a field of Experiment class

In draw_trial method of Trial class:

def draw_trial(self, image_path, draw_raw_data=False, draw_fixation=True, draw_saccade=False, draw_number=False,
               draw_aoi=None, save_image=None):
    """Draws the trial image and raw-data/fixations over the image
    circle size indicates fixation duration

    image_path : str
        path for trial image file.
    draw_raw_data : bool, optional
        whether user wants raw data drawn.
    draw_fixation : bool, optional
        whether user wants filtered fixations drawn
    draw_saccade : bool, optional
        whether user wants saccades drawn
    draw_number : bool, optional
        whether user wants to draw eye movement number
    draw_aoi : pandas.DataFrame, optional
        Area of Interests
    save_image : str, optional
        path to save the image, image is saved to this path if this parameter exists
    """
    im = Image.open(image_path + self.image)

    if self.eye_tracker == "EyeLink1000":
        background_size = (1024, 768)
        background = Image.new('RGB', background_size, color='black')
        *_, width, _ = im.getbbox()
        # offset = int((1024 - width) / 2) - 10
        trial_location = (10, 375)
        background.paste(im, trial_location, im.convert('RGBA'))
        im = background.copy()

    bg_color = find_background_color(im.copy().convert('1'))
    draw = ImageDraw.Draw(im, 'RGBA')

    if draw_aoi and isinstance(draw_aoi, bool):
        aoi = find_aoi(image=self.image, img=im)
        self.__draw_aoi(draw, aoi, bg_color)

    if isinstance(draw_aoi, pd.DataFrame):
        self.__draw_aoi(draw, draw_aoi, bg_color)

    if draw_raw_data:
        self.__draw_raw_data(draw)

    if draw_fixation:
        self.__draw_fixation(draw, draw_number)

    if draw_saccade:
        self.__draw_saccade(draw, draw_number)

    plt.figure(figsize=(17, 15))
    plt.imshow(np.asarray(im), interpolation='nearest')

    if save_image is not None:
        # Save the image with applied offset
        image_name = save_image + \
                     str(self.participant_id) + \
                     "-t" + \
                     str(self.trial_id) + \
                     "-offsetx" + \
                     str(self.get_offset()[0]) + \
                     "y" + \
                     str(self.get_offset()[1]) + \
                     ".png"
        plt.savefig(image_name)
        print(image_name, "saved!")

With the background size:

background_size = (1024, 768)

and trial location:
trial_location = (10, 375)

I suggest saving them as fields of the Experiment class instead of declaring them arbitrarily without any context, because the coordinates of the fixations depend on the background size and the trial location.
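A minimal sketch of that idea (field names are assumptions for illustration):

class Experiment:
    def __init__(self, eye_tracker):
        self.eye_tracker = eye_tracker
        # Geometry that the fixation coordinates depend on:
        self.background_size = (1024, 768)  # background width, height in pixels
        self.trial_location = (10, 375)     # where the stimulus image is pasted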

Add unit tests

So far we have been using the example notebooks as tests, but it would be much better to develop unit tests for every method in the toolkit. Maybe consider automating the testing process on GitHub to make collaboration and onboarding easier.
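As a starting point, a single unit test might look like this (a sketch: the import path, signature, and behavior of idt_classifier here are assumptions):

from emtk.classifier import idt_classifier  # hypothetical import path

def test_idt_classifier_merges_stationary_samples():
    # Samples that barely move should collapse into a single fixation.
    samples = [(t, 100.0, 200.0) for t in range(0, 100, 4)]
    fixations = idt_classifier(samples, minimum_duration=50, maximum_dispersion=25)
    assert len(fixations) == 1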

Generate a synthetic set of fixations and eye movements

For the sake of testing ideas like fixation correction, it would be great to be able to generate "synthetic" eye movements according to some model, or perhaps completely at random.

This can be used for testing code and demonstrating features as well.
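A minimal sketch of the completely random variant (the field names follow the Fixation parser above, but this generator is hypothetical):

import random

def generate_random_fixations(n, width=1024, height=768,
                              min_duration=50, max_duration=600):
    """Return n completely random fixations as dictionaries."""
    return [{
        "x_cord": random.uniform(0, width),
        "y_cord": random.uniform(0, height),
        "duration": random.randint(min_duration, max_duration),  # ms
    } for _ in range(n)]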

Eliminate inheritance in class design

Initially, we chose inheritance for the class design: every eye movement element was modeled as a superclass, with the subclasses being the specific eye movements from various types of eye trackers. However, we think this would make it difficult to add support for more types of eye trackers in future development. Now, @sdotpeng will eliminate inheritance in our program, creating a universal eye movement class for various eye trackers.

add_srcml_to_AOIs, add_tokens_to_AOIs are not extendable for future datasets

These functions manually match the name of the stimulus with the name of the original code file from which the stimulus was adapted. An example can be seen below (this is taken from add_tokens_to_AOIs):

EMIP-Toolkit/emip_toolkit.py, lines 1222 to 1245 in d1a7eab:

if image_name == "rectangle_java.jpg":
    file_name = "Rectangle.java"
if image_name == "rectangle_java2.jpg":
    file_name = "Rectangle.java"
if image_name == "rectangle_python.jpg":
    file_name = "Rectangle.py"
if image_name == "rectangle_scala.jpg":
    file_name = "Rectangle.scala"
# vehicle files
if image_name == "vehicle_java.jpg":
    file_name = "Vehicle.java"
if image_name == "vehicle_java2.jpg":
    file_name = "Vehicle.java"
if image_name == "vehicle_python.jpg":
    file_name = "vehicle.py"
if image_name == "vehicle_scala.jpg":
    file_name = "Vehicle.scala"

This needs to be refactored to make the two functions extendable for future datasets.
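One possible direction (a sketch, not an agreed design) is to replace the if-chain with a lookup table that each dataset can supply or extend:

# The mapping below reproduces the hard-coded pairs from the code above.
IMAGE_TO_FILE = {
    "rectangle_java.jpg": "Rectangle.java",
    "rectangle_java2.jpg": "Rectangle.java",
    "rectangle_python.jpg": "Rectangle.py",
    "rectangle_scala.jpg": "Rectangle.scala",
    # vehicle files
    "vehicle_java.jpg": "Vehicle.java",
    "vehicle_java2.jpg": "Vehicle.java",
    "vehicle_python.jpg": "vehicle.py",
    "vehicle_scala.jpg": "Vehicle.scala",
}

file_name = IMAGE_TO_FILE["rectangle_java.jpg"]  # -> "Rectangle.java"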

Getter issue for sample number and eye movement number in Trial class

Initially, we had a function get_sample_number that returned the total number of eye movements. Later, we decided to store the raw samples in the Trial class. Thus, this function should now return the number of raw samples, while another function called get_eye_movement_number can take over the original job.
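A sketch of the two getters described above (the names come from the issue; the bodies are assumptions about the Trial class internals):

class Trial:
    ...

    def get_sample_number(self):
        """Return the number of raw samples stored in the trial."""
        return len(self.samples)

    def get_eye_movement_number(self):
        """Return the total number of parsed eye movements."""
        return len(self.fixations) + len(self.saccades) + len(self.blinks)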

Add a dynamic integration of srcML into the add_srcML function

The add_srcML function currently uses pre-generated files for the EMIP dataset code; it does not generate srcML tags for arbitrary code.

It would be great to integrate srcML into the tool so it is called automatically (behind the scenes) to generate the srcML tags for any code, then add the tags to the dataframe.

This means we would add srcML as a dependency, so let's see if we can do this in an easy way. It's not clear whether srcML is installable through pip or similar; if it isn't, that might create problems for our automated Actions testing.

A good start would be the srcML website, to understand the tool and how it works: https://www.srcml.org/
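A rough sketch of the integration, assuming the srcml command-line client is installed separately and available on PATH (error handling and the tag-to-dataframe step are elided):

import subprocess

def generate_srcml(source_file):
    """Return the srcML XML for source_file as a string."""
    result = subprocess.run(["srcml", source_file],
                            capture_output=True, text=True, check=True)
    return result.stdout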

Create web documentation for EMTK

Automated web documentation for EMTK would make it easier to understand its methods and functions, and it would provide a helpful reference for users of the tool.

Adapt eye movement classes to empty attributes

Since we decided to remove inheritance and use a universal class for eye movements (#7), we have to adapt the code to allow empty attributes for those situations where one type of eye tracker doesn't record one or more types of eye movement. @sdotpeng is in charge.
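A rough sketch of what that could look like (the field names follow the parser above; the defaults are assumptions):

class Fixation:
    def __init__(self, trial_id, participant_id, timestamp, duration,
                 x_cord, y_cord, token="", pupil=None):
        # pupil defaults to None for trackers that don't record pupil size.
        self.trial_id = trial_id
        self.participant_id = participant_id
        self.timestamp = timestamp
        self.duration = duration
        self.x_cord = x_cord
        self.y_cord = y_cord
        self.token = token
        self.pupil = pupil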

Add datasets in a directory called "datasets"

Initially, there was only the EMIP dataset. After we added a dataset from the EyeLink 1000, we found that the organization of each dataset is different. We now want to keep all datasets in one directory called "datasets", with each dataset in its own folder.

In future development, we will write up instructions for importing datasets.

(Bug) In the Jupyter notebook for the EyeLink1000

im = EMIP[subject_ID].trial[trial_num].draw_trial(image_path, draw_raw_data=False, draw_fixation=True, draw_saccade=False, draw_number=True, draw_aoi=True)

When draw_saccade is set to True, it raises an error claiming a font is missing.

Merge draw_trial implementations into one simple function

Since multiple classes are being merged into one, the implementation of draw_trial should not make assumptions about the specific trial it is drawing. Initially, we wanted to create a unified visualization style, but that might not work for every trial, since variations in background colors and style are possible. Instead, we want the draw_trial method to let the user customize the visualization with various color and style options (a possible signature is sketched below).
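A hypothetical sketch of such a signature (the color and style parameters are suggestions, not a finalized API):

class Trial:
    def draw_trial(self, image_path, draw_raw_data=False, draw_fixation=True,
                   draw_saccade=False, draw_number=False, draw_aoi=None,
                   fixation_color=(255, 0, 0, 128), saccade_color=(0, 0, 255, 128),
                   aoi_color=(255, 255, 0, 64), save_image=None):
        """Draw the trial with user-selected colors instead of fixed ones."""
        ...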

draw_trial method - How to paste an image with a transparent background onto a larger black background image

In the draw_trial method of the Trial class:

(This is the same method quoted in full in the enhancement issue above; the part relevant here is the EyeLink1000 branch.)

im = Image.open(image_path + self.image)

if self.eye_tracker == "EyeLink1000":
    background_size = (1024, 768)
    background = Image.new('RGB', background_size, color='black')
    trial_location = (10, 375)
    background.paste(im, trial_location, im.convert('RGBA'))
    im = background.copy()

This line of code pastes an image from the Al Madi 2018 runtime dataset (an image with white text and a transparent background, hereafter referred to as "the image") onto a black background; this behavior is hereafter referred to as "this feature":

background.paste(im, trial_location, im.convert('RGBA'))

1st Question: How does converting the image into RGBA and using it as a mask image manage to achieve this feature?

My expectation is that to achieve this feature, we only need to paste the image on top of the black background without having to use any mask image:

background.paste(im.convert('RGBA'), trial_location)

Because the background of the image is already transparent and the text is white, contrasting with the black background. However, what I get is a completely white box on a black background.

2nd Question: Why does the line of code I wrote fail to achieve this feature?

Here is the full code I used to test both ways:

    from PIL import Image

    image_path = "EMIP-Toolkit/datasets/AlMadi2018/runtime/images/5667346413132987794.png"
    im = Image.open(image_path)

    background_size = (1024, 768)
    background = Image.new('RGBA', background_size, color='black')

    trial_location = (10, 375)

    # background.paste(im, trial_location, im.convert('RGBA'))
    background.paste(im.convert('RGBA'), trial_location)
    background.save("result.png")

    im = background.copy()
    im.show()

Parse samples into dataframe / a list of objects instead of list

The samples field of the Trial class stores the raw samples from datasets. The field is currently a list of samples, with each sample represented by another list. It should instead be a dataframe, with each row corresponding to one sample, or a list of objects, with each object corresponding to one sample. This way, it will be clearer what features each sample has.

Representing each sample with a list can lead to the use of magic numbers to access a sample's information. An example can be seen below:

if self.eye_tracker == "SMIRed250":
    for sample in self.samples:
        # Invalid records
        if len(sample) > 5:
            x_cord = float(sample[23])
            y_cord = float(sample[24])  # - 150
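A sketch of the dataframe alternative (the column names and example values are illustrative assumptions):

import pandas as pd

raw_samples = [
    # (timestamp, x_cord, y_cord), already extracted from the tracker log
    (1000, 512.3, 384.9),
    (1004, 513.1, 385.2),
]
samples_df = pd.DataFrame(raw_samples, columns=["timestamp", "x_cord", "y_cord"])

x_cords = samples_df["x_cord"]  # access by name instead of sample[23]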

Change fixation filter to free function and call it “idt_classifier”

Initially, the fixation_filter function in the Trial class both removes invalid samples and classifies fixations based on the I-DT algorithm. Now, we want to divide the tasks so that idt_classifier does only the classification, while adding one more free function that only removes the invalid samples.
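A sketch of the split (the signatures and the sample format are assumptions for illustration):

def remove_invalid_samples(samples):
    """Drop samples that have no usable coordinates."""
    return [sample for sample in samples if len(sample) > 5]

def idt_classifier(samples, minimum_duration=50, sample_duration=4,
                   maximum_dispersion=25):
    """Classify fixations with the I-DT algorithm (classification only)."""
    ...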

Visualization: video reconstruction of a trial

Add a new visualization that generates a video of a trial from the stimulus image and fixation (and possibly saccade) timestamps. The fixation position should appear as a circle, and the video should play in real time (not faster or slower than the recording).
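A rough sketch using OpenCV (an assumed dependency, not currently part of the toolkit); it draws one circle per fixation and holds it for the fixation's duration so playback stays in real time:

import cv2

def reconstruct_video(stimulus_path, fixations, out_path="trial.mp4", fps=25):
    """fixations: list of (x, y, duration_ms) tuples in trial order."""
    frame = cv2.imread(stimulus_path)
    height, width = frame.shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"),
                             fps, (width, height))
    ms_per_frame = 1000 / fps
    for x, y, duration in fixations:
        marked = frame.copy()
        cv2.circle(marked, (int(x), int(y)), 15, (0, 0, 255), 2)
        # Hold the circle for as many frames as the fixation lasted.
        for _ in range(max(1, round(duration / ms_per_frame))):
            writer.write(marked)
    writer.release()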
