roster-hub's People

Contributors

j-nakashima, kyoshizaki

roster-hub's Issues

Implement more OneRoster endpoints

1st Priority implementations

Service Call         | Endpoint                     | HTTP Verb | Action
getStudentsForClass  | /classes/{class_id}/students | GET       | Return the collection of students that are taking this class.
getTeachersForClass  | /classes/{class_id}/teachers | GET       | Return the collection of teachers that are teaching this class.
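
A minimal sketch of the lookup behind these two endpoints, assuming enrollments carry a role of 'student' or 'teacher' as in the OneRoster enrollment model. The Struct types and method names below are hypothetical in-memory stand-ins, not RosterHub's actual models or controllers.

```ruby
# Hypothetical in-memory stand-ins for the users and enrollments tables.
User = Struct.new(:sourced_id, :name)
Enrollment = Struct.new(:class_id, :user_id, :role)

# Return the users enrolled in the given class with the given role.
def users_for_class(users, enrollments, class_id, role)
  user_ids = enrollments
             .select { |e| e.class_id == class_id && e.role == role }
             .map(&:user_id)
  users.select { |u| user_ids.include?(u.sourced_id) }
end

# GET /classes/{class_id}/students
def students_for_class(users, enrollments, class_id)
  users_for_class(users, enrollments, class_id, 'student')
end

# GET /classes/{class_id}/teachers
def teachers_for_class(users, enrollments, class_id)
  users_for_class(users, enrollments, class_id, 'teacher')
end
```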

Remove old backups

  • Remove backup files / directories older than a specified number of days
  • The number of days is set by CSV_BACKUP_DAYS
    • Do not remove any backup files / directories when CSV_BACKUP_DAYS is 0
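
The rule above could be sketched as follows. This is only an illustration: the method name, the use of modification time to judge age, and reading CSV_BACKUP_DAYS from the environment are all assumptions, not the project's actual implementation.

```ruby
require 'fileutils'

# Remove entries directly under backup_dir whose modification time is more
# than `days` days before `now`. A value of 0 disables removal entirely.
# Returns the list of removed paths.
def remove_old_backups(backup_dir, days, now: Time.now)
  return [] if days.zero?

  cutoff = now - days * 24 * 60 * 60
  removed = Dir.children(backup_dir)
               .map { |name| File.join(backup_dir, name) }
               .select { |path| File.mtime(path) < cutoff }
  removed.each { |path| FileUtils.rm_rf(path) } # handles files and directories
  removed
end

# Usage (directory name and environment variable lookup are assumptions):
# remove_old_backups('backup/csv', Integer(ENV.fetch('CSV_BACKUP_DAYS', '0')))
```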

Use sourcedId as primary key for table

Issue

  • Each bulk update process consumes a large number of auto-increment id values

Implementations

  • Use sourcedId as primary key instead of id
  • To support many database types and sourcedId types, configure the column as follows:
    • The column type for sourcedId is string, NOT binary
    • The character limit option for sourcedId is NOT used

Store records related to the past terms

Issue

  • Each import process drops all existing records and inserts new records into the tables.
  • As a result, records related to past terms are deleted from the DB.

Fixed Import job procedures

  • Create new records and update existing records
  • Delete class and enrollment records that have disappeared from the CSV file within the stated terms.

Remarks: works for this issue

  • Set a unique index on sourcedId in each table
  • Use "On Duplicate Key Update" option for bulk update
    • ATTENTION: "On Duplicate Key Update" option has DB dependency (MySQL is default DB).
  • Delete class and enrollment records when...
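
The create / update / delete decision described above can be illustrated without a database. The sketch below is a simplification under stated assumptions: plan_import is a hypothetical helper, records are plain hashes keyed by sourcedId, and the "within the stated terms" check is supplied by the caller as a block. Unlike "On Duplicate Key Update", which is applied to every row, it lists only the rows whose attributes actually differ.

```ruby
# Decide what one import run should do with the records of one table.
# existing: { sourcedId => attributes } currently stored in the DB
# incoming: { sourcedId => attributes } read from the CSV file
# The block receives the attributes of a record that is missing from the CSV
# and returns true when it may be deleted (e.g. it belongs to a stated term).
def plan_import(existing, incoming)
  {
    create: incoming.keys - existing.keys,
    update: incoming.keys.select { |id| existing.key?(id) && existing[id] != incoming[id] },
    delete: (existing.keys - incoming.keys).select { |id| yield(existing[id]) }
  }
end
```

Records from past terms fall outside the block's condition, so they stay in the DB even though they no longer appear in the CSV file.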

Check dependencies

After processing, check whether each referenced record exists.
If it does not, write an Error entry to the log file.
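
A minimal sketch of such a check, using Ruby's standard Logger. The method name, the row format (hashes keyed by CSV column name) and the log message are assumptions for illustration only.

```ruby
require 'logger'

# Return the rows whose reference column points at a sourcedId that is not in
# known_ids, logging each of them as an error.
def missing_references(rows, column, known_ids, logger)
  rows.reject { |row| known_ids.include?(row[column]) }.each do |row|
    logger.error("#{column}=#{row[column].inspect} not found (sourcedId=#{row['sourcedId']})")
  end
end

# Usage (the log path is hypothetical):
# logger = Logger.new('log/csv_import.log')
# missing_references(class_rows, 'courseSourcedId', course_ids, logger)
```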

Accelerate CSV import time

Check the time to import CSV

Data size in verification CSV files

  • academicSessions: 5
  • classes: about 3,800
  • courses: about 19,900
  • enrollments: about 97,600
  • orgs: 1
  • users: about 6,400

Import time result

  • Without model associations: about 2'25
  • With model associations: about 13'40
    • With bulk insert: about 0'51

Conclusion

  • Use 'bulk insert' with activerecord-import gem for massive CSV files.

Correspond to the metadata elements

All data models of OneRoster 1.1 can be extended through the metadata element.
If RosterHub supports this metadata element, various additional data can be sent while staying within the OneRoster 1.1 specification.

Validate associations operated through API

Issue

  • Validate associations not only for CSV-imported data, but also for data created or changed through the API.

Plan

  • Validate associations for each save process
  • Validate associations in ActiveRecord, not in ActiveJob.

Considerations: CSV file import order

Group 1: No references to other tables

  • academicSessions: (parentSourcedId)
  • categories
  • demographics
  • orgs: (parentSourcedId)
  • resources

Group 2: References to other tables in Group 1

  • courses: schoolYearSourcedId (SourcedId of an AcademicSession), orgSourcedId
  • users: orgSourcedIds, (agentSourcedIds = SourcedIds of the Users)

Group 3: References to other tables in Group 1&2

  • classes: courseSourcedId, schoolSourcedId(SourcedId of the Org), termSourcedIds
  • courseResources: courseSourcedId, resourceSourcedId

Group 4: References to other tables in Group 1&2&3

  • classResources: classSourcedId, resourceSourcedId
  • enrollments: classSourcedId, schoolSourcedId, userSourcedId
  • lineItems: classSourcedId, categorySourcedId, gradingPeriodSourcedId (SourcedId of the AcademicSession)

Group 5: References to other tables in Group 1&2&3&4

  • results: lineItemSourcedId, studentSourcedId
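
The grouping above can be derived mechanically from the reference list. The following sketch assumes the references as listed in the groups, with self-references such as parentSourcedId and agentSourcedIds left out. The constant and method names are hypothetical, and the method does not guard against circular references.

```ruby
# Table => tables it references (self-references are omitted).
REFERENCES = {
  'academicSessions' => [],
  'categories'       => [],
  'demographics'     => [],
  'orgs'             => [],
  'resources'        => [],
  'courses'          => %w[academicSessions orgs],
  'users'            => %w[orgs],
  'classes'          => %w[courses orgs academicSessions],
  'courseResources'  => %w[courses resources],
  'classResources'   => %w[classes resources],
  'enrollments'      => %w[classes orgs users],
  'lineItems'        => %w[classes categories academicSessions],
  'results'          => %w[lineItems users]
}.freeze

# A table's group number is one more than the highest group it references.
# Returns the tables as an array of groups, in import order.
def import_groups(references)
  level = Hash.new do |memo, table|
    memo[table] = 1 + references.fetch(table).map { |ref| memo[ref] }.max.to_i
  end
  references.keys.group_by { |table| level[table] }.sort.map(&:last)
end

import_groups(REFERENCES).each_with_index do |tables, i|
  puts "Group #{i + 1}: #{tables.join(', ')}"
end
```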

Keep record updated time at "updated_at" column

Issue

  • Existing records are updated by every bulk-update process even when nothing has changed, so the values in their "updated_at" columns are also updated.

Suggested solution by j-nakashima

[migration file]

class xxxxxx < ActiveRecord::Migration[5.1]

+CREATE_TIMESTAMP = 'DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP'
+UPDATE_TIMESTAMP = 'DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'

  def change
       :
-      t.timestamps
+      t.column :created_at, CREATE_TIMESTAMP
+      t.column :updated_at, UPDATE_TIMESTAMP

[csv_import_job.rb]
Line 109: Add for the new record.

    hash['created_at'] = Time.current
    hash['updated_at'] = hash['created_at']

Line 112(114): Alter

    update_columns = cl.column_names.reject{|c| %w[id sourcedId created_at updated_at].include? c}
    cl.import instances, on_duplicate_key_update: update_columns, timestamps: false

This way, timestamps are updated only if the record changes.
Does it need to be simpler?
If "NOW()" is used instead of "CURRENT_TIMESTAMP",
it might work with PostgreSQL (needs confirmation).

Consideration

DB dependencies or limitations for migration files

  • MySQL
  • MariaDB
  • PostgreSQL
  • SQLite

Column type change from "datetime" to "timestamp" for MySQL/MariaDB

  • No side effect for this change?

DB version upgrade for production environment

  • Upgrade from MariaDB 5.5 to MariaDB 10.0 or above?
  • The differences between MariaDB and MySQL have been growing since MySQL 5.6 / MariaDB 10.0

Extend API to support creation / update / deletion for some resources

Issue

IMS OneRoster 1.1 for rostering supports only the GET method for API access, so information about classes that are not registered in the SIS can NOT be added.

Solution

  • Add RosterHub's own API with POST/PUT/DELETE methods for rclasses, enrollments and courses.
  • Every record has an application_id indicating the data source application
    • application_id is the primary key of the oauth_applications table
  • application_id for records created by CsvImportJob is set to 0
  • For POST/PUT access, a JSON payload for the record is required.
  • For PUT/DELETE access, sourcedId must be given in the URL, NOT in the JSON payload.
  • Records generated by the API (POST) are NOT deleted by CSV-file sync processes.
  • All resource controllers have create / update / delete actions written in application_controller, but routes.rb restricts those actions to the controllers for rclasses, enrollments and courses.
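
The rule that API-created records survive a CSV sync can be sketched as a simple split on application_id. The constant and method names are hypothetical, and records are shown as plain hashes rather than ActiveRecord objects.

```ruby
# application_id used for records created by CsvImportJob (per the list above).
CSV_IMPORT_APPLICATION_ID = 0

# Split records that are missing from the CSV file into those the sync may
# delete (created by a CSV import) and those it must keep (created via the API).
def split_for_csv_sync(missing_records)
  missing_records.partition { |r| r[:application_id] == CSV_IMPORT_APPLICATION_ID }
end

# Usage:
# deletable, kept = split_for_csv_sync(missing_records)
```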

Fix response to more accurately follow the specification

Issues

  • Remove RoR specific items: id, created_at and updated_at
  • Wrap the payload in braces {} under the data model title, such as "academicSession(s)".
  • Error handling for no response data

Response payload sample for "getAcademicSession"

Response (payload): Specification

{"academicSessions": [{"sourcedId":"sample-guid-1","status":null,"dateLastModified":null,"title":"2017-intensive","type":"term","startDate":"2017-04-01","endDate":"2018-03-31","parentSourcedId":null,"schoolYear":2018},{"sourcedId":"sample-guid-2","status":null,"dateLastModified":null,"title":"2017-full","type":"term","startDate":"2017-04-01","endDate":"2018-03-31","parentSourcedId":null,"schoolYear":2018}]}

Response (payload): Current status

[{"id":1,"sourcedId":"sample-guid-1","status":null,"dateLastModified":null,"title":"2017-intensive","type":"term","startDate":"2017-04-01","endDate":"2018-03-31","parentSourcedId":null,"schoolYear":2018,"created_at":"2018-10-13T08:31:29.330+09:00","updated_at":"2018-10-13T08:31:29.330+09:00"},{"id":2,"sourcedId":"sample-guid-2","status":null,"dateLastModified":null,"title":"2017-full","type":"term","startDate":"2017-04-01","endDate":"2018-03-31","parentSourcedId":null,"schoolYear":2018,"created_at":"2018-10-13T08:31:29.328+09:00","updated_at":"2018-10-13T08:31:29.328+09:00"}]
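
One way to get from the current payload to the specified one is to drop the Rails-specific keys and wrap the list under the data model title. This is a sketch only: the method and constant names are hypothetical, and records are shown as plain hashes with string keys rather than ActiveRecord objects.

```ruby
require 'json'

# Keys added by Rails that are not part of the OneRoster data models.
RAILS_ONLY_KEYS = %w[id created_at updated_at].freeze

# Build a payload of the form {"<title>": [ ...records... ]} with the
# Rails-specific keys removed from every record.
def oneroster_payload(title, records)
  { title => records.map { |r| r.reject { |key, _| RAILS_ONLY_KEYS.include?(key) } } }
end

rows = [
  { 'id' => 1, 'sourcedId' => 'sample-guid-1', 'title' => '2017-intensive',
    'created_at' => '2018-10-13T08:31:29.330+09:00',
    'updated_at' => '2018-10-13T08:31:29.330+09:00' }
]
puts JSON.generate(oneroster_payload('academicSessions', rows))
```

An empty result still yields a well-formed object with an empty list, which is one possible answer to the "no response data" case.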
