normxu,Norm Inui,github

Hi there 👋🏻,

🔭 I am currently focused on developing multimodal models that natively generate and understand both text and images.

My specific areas of interest include:

Denoising

~~Auto-regressive Generation~~
~~Diffusion~~

PS: I recognize that all diffusion and next-token generation tasks are inherently denoising tasks. (2024/05)

~~Document Understanding & Layout Analysis~~
~~Optical Character Recognition~~
~~Object Detection~~

PS: Looks like all these tasks can be regarded as next-token generation tasks. (2023/12)

In addition to my current work, I have prior experience in Robotics Perception from my Master's studies. I hope my work can be helpful to you.

Feel free to reach out if you have any questions or if there's anything I can assist with!

Norm Inui's Projects

build-dev-image

A quick start for building Docker dev images that lets you specify the torch version, base image, Python version, etc.

consistent-dynamicntkrope

An Experiment on Dynamic NTK Scaling RoPE

docparser-pytorch

An unofficial implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents

ernie-layout-pytorch

An unofficial PyTorch implementation of ERNIE-Layout, which was originally released through PaddleNLP.

layout2graph

An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"

nougat-latex-ocr

Codebase for fine-tuning / evaluating nougat-based image2latex generation models
