data-umbrella / event-transcripts
transcripts from our recorded events
Home Page: https://www.youtube.com/c/dataumbrella/videos
I'm a bit confused by this file. When I navigate through the GitHub repo, it seems like it should show a different thumbnail, but it shows an old one:
https://github.com/data-umbrella/event-transcripts/blob/main/2022/41-austin-pymc.md
https://github.com/data-umbrella/event-transcripts/blob/main/2022/53-Sebastian-Adrian-%20PyTorch
For reference, you can see this one:
https://github.com/data-umbrella/event-transcripts/blob/main/2022/52-will-graphql.md
@Cristinamulas
We can set up a quick call to review this, and go over some missing steps.
Timestamps | Description |
---|---|
00:00 | Welcome |
00:13 | Sandra introduces the topic |
1:00 | What is MLOps? |
1:59 | What is DevOps? |
4:08 | What is Continuous Integration/Continuous Deployment (CI/CD)? |
5:48 | ML systems |
5:53 | Machine Learning Lifecycle |
7:28 | Data Team |
8:52 | Why are ML systems different |
10:40 | ML challenges in the Dev process |
10:46 | Experimentation |
11:27 | Reproducibility |
12:59 | Tracking and versioning |
13:54 | Git for Data Science |
15:15 | Automated Testing |
16:27 | Deployment |
17:04 | Monitoring |
19:10 | MLOps Practices |
19:26 | Data Management |
21:21 | Model Management |
22:14 | Model Evaluation |
23:15 | Online ML system validation |
25:06 | Responsible AI |
25:46 | Continuous Training (CT) |
28:16 | MLOps Maturity Model |
31:03 | Automated Pipeline |
32:50 | What did we learn? |
34:07 | Books |
34:23 | Sources |
34:50 | Tools Review |
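The table rows above can be turned into the plain `MM:SS description` lines used for YouTube video descriptions. The following is a minimal sketch, not part of the repository: the function name `table_to_lines` is hypothetical, and zero-padding the first field (`1:00` to `01:00`) is a cosmetic assumption rather than a stated requirement.

```python
def table_to_lines(table_text):
    """Convert 'time | description |' table rows into 'MM:SS description' lines."""
    lines = []
    for row in table_text.splitlines():
        # Drop the trailing pipe, then split into cells.
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        if len(cells) < 2:
            continue
        stamp, description = cells[0], cells[1]
        # Skip the header row and the ---|--- separator row.
        if not stamp or not stamp[0].isdigit():
            continue
        # Zero-pad the first field so every line starts with two digits (assumption).
        parts = stamp.split(":")
        parts[0] = parts[0].zfill(2)
        lines.append(f"{':'.join(parts)} {description}")
    return lines


if __name__ == "__main__":
    sample = (
        "Timestamps | Description |\n"
        "---|---|\n"
        "00:00 | Welcome |\n"
        "1:00 | What is MLOps? |\n"
    )
    print("\n".join(table_to_lines(sample)))
```

A description that itself contains a `|` character would be cut off at that character; none of the rows above do.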
Create transcript file for Mitzi BRMS event
2023/74-mitzi-brms.md
cc: @SangamSwadiK
@melissawm
The transcript [notes] can be added to this file:
https://github.com/data-umbrella/event-transcripts/blob/main/2022/54-melissa-scipy.md
https://github.com/data-umbrella/event-transcripts/blob/main/README.md
Also, please double-check that the links in this file work.
If you have any questions, please ask here.
Add in timestamps here:
https://github.com/data-umbrella/event-transcripts/blob/main/misc/51_rob_wikipedia.md
Pablo Duboue: Solving NLP (Natural Language Processing) Tasks Using LLMs (Large Language Models)
## Timestamps
00:00 Data Umbrella Introduction
02:43 Speaker Introduction + Land Acknowledgment
04:52 Agenda
06:15 NLP history (rule-based, statistical, deep learning)
09:00 What is a Language Model?
10:35 Large Language Models
11:55 Training LLMs - more than just language
12:48 Speaker background
13:35 About this talk - more background
14:26 Section 1: NLP / LLM Tasks - Part-of-Speech tagging
15:48 POS tagging example
16:50 NLP Tasks - Named Entity Recognition (NER), example
17:50 NLP Tasks - Information Extraction (IE), example
19:08 NLP Tasks - Sentiment Analysis, example
20:32 Q&A - data tagging
22:41 Section 2: Prompting 101
22:51 OpenAI API - intro, CLI, Python
25:44 Zero shot - no examples, temperature, output/hallucinations
28:35 Few shot - training data, output, GPT-4
30:17 Handling priors in exemplars
30:40 Chain-of-thought (CoT)
31:13 LLM role
31:43 Recursing
32:23 Learning more - additional resources
33:14 Section 3: Solving NLP Tasks with OpenAI API
33:27 OpenAI POS tagging
34:11 Output is unstable
34:21 Better prompt
34:40 Annotation Manual
36:03 NER prompt, unstable output, MUC-6 locations
38:28 ChatGPT output
38:41 GPT4 output
39:00 Q&A - AGI
40:28 IE prompt - relation extraction, stable output
42:31 Sentiment Analysis prompt
43:26 Additional discourse
44:22 Section 4: Using open source LLMs
44:39 Why open source LLMs
45:44 Issues with open source models
46:33 Examples of open source LLMs
49:29 Conclusions
51:34 Q&A - contributing to new models v. expanding on older ones, LLMs in cell phones, communication changes and abstraction, etc.
## Resources
- https://tellandshow.org/ (community-owned machine learning)
- http://textualization.com/gptwhitepaper/
- http://artoffeatureengineering.com/
- http://wiki.duboue.net/A_Dollar_Worth_Of_Ideas (project ideas)
## Connecting
- LinkedIn: https://www.linkedin.com/in/pabloduboue/
- GitHub: https://github.com/drdub
- Twitter: @pabloduboue
Working on Timestamps for 62 Software Engineering for Probabilistic Programming
Towards #92
Adding timestamps to the description section of the videos on Data Umbrella YouTube channel.
When timestamps are available:
Your helpful contribution is greatly appreciated!!
(txt format)
00:00 Introduction
10:00 example
12:23 example
Example PR: #143
Added timestamps for [topic x]
Towards #92
Closes #xxx [replace xxx with related issue, if there is one]
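Before opening a PR, a timestamps file in the txt format shown above can be checked with a short script. This is a minimal sketch under stated assumptions: the names `parse_line` and `check_timestamps` are hypothetical, the accepted pattern (`MM:SS` or `H:MM:SS` followed by a description) is inferred from the examples in this thread, and the two checks (first entry at 00:00, strictly increasing times) reflect how YouTube chapters are commonly described rather than rules stated in this issue.

```python
import re

# Matches "MM:SS text" or "H:MM:SS text"; the exact pattern is an assumption.
LINE_RE = re.compile(r"^(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\s+(\S.*)$")


def parse_line(line):
    """Return (seconds, description) for one timestamp line, or None if malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    hours, minutes, seconds, description = match.groups()
    total = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return total, description


def check_timestamps(text):
    """Return a list of human-readable problems found in a timestamps file."""
    problems = []
    previous = None
    entries = [line for line in text.splitlines() if line.strip()]
    for number, line in enumerate(entries, start=1):
        parsed = parse_line(line)
        if parsed is None:
            problems.append(f"line {number}: not in 'MM:SS description' form")
            continue
        seconds, _ = parsed
        if number == 1 and seconds != 0:
            problems.append(f"line {number}: first timestamp should be 00:00")
        if previous is not None and seconds <= previous:
            problems.append(f"line {number}: timestamp is not after the previous one")
        previous = seconds
    return problems


if __name__ == "__main__":
    sample = "00:00 Introduction\n10:00 example\n12:23 example\n"
    print(check_timestamps(sample) or "no problems found")
```

Lines such as `00: 01 Agenda` (space after the colon) or `22:51: OpenAI API` (extra colon) are reported as malformed, which covers the formatting slips seen elsewhere in these issues.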
In the following files, this text needs to be removed: ** NEED TO UPDATE **
Timestamp | Description |
---|---|
00:01 | Agenda |
00:39 | Introduction to Data Umbrella |
1:04 | Code of Conduct |
1:24 | How to support Data Umbrella |
4:58 | Introduce the talk and speaker |
6:15 | Speaker introduces herself and topic |
7:48 | Background |
10:28 | Our datasource: ShareGPT |
11:24 | The Vicuna Project |
12:23 | Evaluation: GPT-4 as a judge |
14:17 | Chatbot Arena: Benchmarking LLMs in the wild |
16:17 | Next steps: better benchmark |
17:23 | Can we really trust LLM as a judge? |
17:43 | Overview |
21:03 | Limitations |
23:46 | Solutions |
24:29 | Positive Side: High Agreement with Humans |
26:35 | Summary |
30:36 | Human Preference Benchmark and Standardized Benchmark |
34:36 | Questions |
38:14 | Organizer wrap up |
38:52 | Links |
Create transcript file for Estela Sales Eng event
2023/75-estela-sales.md
cc: @SangamSwadiK
[ 44 ] Reshama's PR PyMC example
[ 45 ] Oriol's talk
To do:
Meetup event: https://www.meetup.com/data-umbrella/events/294319436/
The name is currently: "Oriol Abil Pla"
It should be "Oriol Abril Pla" (This is what's in the Meetup and everywhere else.)
https://github.com/data-umbrella/event-transcripts/blob/main/README.md
Here is an example of the description included with the YouTube video.
video: https://youtu.be/5c4cb6kvJGE
Event: R, an Ecosystem Where Pythonistas Can Thrive
## Upcoming Events
Join our Meetup group for more events!
https://www.meetup.com/data-umbrella
## Agenda
00:00:00 to 00:05:50 Introduction to Data Umbrella
00:05:50 Ian Introduction
00:56:30 Q&A begins
01:08:30 Demo of RStudio IDE
## Event
This talk will introduce Python users to the R ecosystem. Attendees should expect that by the end of the talk they will understand how to get started with reporting infrastructure in Python and R; and how to use open standards to share data across data products. As part of the discussion, we will see different use cases for both languages, their integration, and common pitfalls.
## Speaker
Ian is a data person with a background in Data Science and DevOps. He has experience consulting within the pharmaceutical industry, government sector, NGOs and educational institutions in multiple countries. As part of his academic background he holds a Master’s degree in Data Science from the University of British Columbia. Outside of work, he is a certified freediver, loves to surf on the northeast coast of Puerto Rico, and cooks spicy food.
## Slides:
https://ian-flores.github.io/r-ecosystem-4-python-slides/slides.html#1
## Resources
https://github.com/ian-flores/r-ecosystem-4-python
@Cristinamulas
Let's keep to our former conventions for file names:
2023/79-gordon-shiny.md
Meetup:
https://www.meetup.com/data-umbrella/events/292848290/
2023/78-jeff-pyomo.md
Meetup:
https://www.meetup.com/data-umbrella/events/292318566/
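The file names above appear to follow a `YEAR/NUMBER-firstname-topic.md` pattern. A minimal sketch of a helper that builds such a path is shown here; the function names and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) are assumptions inferred from the examples in this thread, not a documented convention of the repository.

```python
import re


def slugify(text):
    """Lowercase the text and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def transcript_path(year, number, first_name, topic):
    """Build a transcript path such as '2023/79-gordon-shiny.md' (assumed pattern)."""
    return f"{year}/{number}-{slugify(first_name)}-{slugify(topic)}.md"


if __name__ == "__main__":
    print(transcript_path(2023, 79, "Gordon", "Shiny"))
    print(transcript_path(2023, 78, "Jeff", "Pyomo"))
```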
Finish rest of timestamps:
https://github.com/data-umbrella/event-transcripts/blob/main/misc/49_lauren_jekyll.md
I am working on this video.
@nestornav Can we renumber as follows:
40. Meenal
41. Austin, PyMC
42. Gonzalo, GitHub Actions
Thanks.
Meetup event: https://www.meetup.com/data-umbrella/events/294319559/
Here is a link to the edited transcript by Sethupathy (Google Docs):
https://docs.google.com/document/d/1HB4TAXa240NxnnTuL5RZVXyRvQBAqEDZ5P3w3uL_97k/edit?usp=sharing
What needs to be done:
Shivay Lamba: Machine Learning in JavaScript: An Introduction to TensorFlowJS
## Timestamps
00:00 Data Umbrella Introduction
03:06 Speaker Introduction
04:17 Presentation Intro - Machine Learning for the Web: Introduction to TensorFlow.JS
06:27 Why do we need machine learning in JavaScript (JS)?
08:02 What is TensorFlow?
09:03 Versatility & language popularity - ML can be used on any platform JS can run
10:30 ML application ideas (e.g. accessible web apps, sound recognition, etc.)
11:38 3 options for using TensorFlow
13:00 Option 1: Use pre-trained models with JS classes
15:14 Real-world examples
19:15 Option 2: Retrain existing neural network models to work with your own data
19:51 Image classification example (100 images) with Teachable Machine (separate tool with downloadable code)
26:24 Pause for Q&A
26:55 Example using Cloud Auto ML for larger image datasets (100,000+)
28:29 Option 3: Coding your own model
29:32 High-level TensorFlow architecture
31:06 Backends and hardware execution
32:07 Chart - Model Inference Performance Only
32:45 Chart - performance comparison between JS and Python of Hugging Face DistilBERT (NLP-based model)
33:05 5 benefits of using TensorFlow on the front end (client side)
33:59 4 benefits of using TensorFlow on the back end (server side)
35:00 Demo code example 1 - image detection
42:35 Demo code example 2 - TensorFlow.JS converter (converting Python model to JS model)
47:24 More resources for learning and inspiration
49:40 Join the community - #MadeWithTFJS
50:34 What will you make? Machine learning is for everyone.
50:57 Q&A (Using PyTorch, using TensorFlow in production) & final thoughts
## Resources
Website / API: https://www.tensorflow.org/js
Models: https://www.tensorflow.org/js/models
GitHub Code: https://github.com/tensorflow/tfjs
Google Group: [email protected]
TensorFlow forum: https://discuss.tensorflow.org/tag/tfjs
YouTube playlist: goo.gle/made-with-tfjs
Codepen: https://codepen.io/topic/tensorflow/
Glitch: https://glitch.com/@TensorFlowJS
Sample dataset: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database
Recommended Reading: Deep Learning with JavaScript - https://www.manning.com/books/deep-learning-with-javascript
Book: Learning TensorFlow.js - https://github.com/GantMan/learn-tfjs
edX: https://www.edx.org/learn/javascript/google-google-ai-for-javascript-developers-with-tensorflow-js
@CeeThinwa currently transcribing this video.
I'm working on this one.
@nestornav this video has finally been posted on our YouTube!
Would you be able to update this transcripts file?
file to update:
https://github.com/data-umbrella/event-transcripts/blob/1032b6d42032a00c6ec6e3dee072c758c105db64/2022/40-meenal-array.md
there is reference info here:
https://github.com/data-umbrella/event-transcripts/blob/main/misc/40_meenal_arrays.md
Add in timestamps here:
https://github.com/data-umbrella/event-transcripts/blob/main/misc/50_julia_holoviz.md
==> fixed by Reshama
==> fixed by Reshama
==> fixed by Reshama
Create transcript file for Sandra MLOps event
2023/76-sandra-mlops.md
cc: @SangamSwadiK