
data-science-work's Introduction

Hi there 👋

I'm Mason (he/him)! I am a PhD in applied machine learning working as a Founding Machine Learning Engineer at Okareo.

Most of my GitHub activity for 2022-23 is visible at @masondelro.

🔭 I’m currently working on ...

  • A set of Python tutorials on how to implement ML explainability/fairness methods from scratch.
  • A portfolio of data science projects to refresh my data cleaning/ML skills. Includes the following projects:
    • Tweet sentiment classification with BERT and fine-tuned GPT-3.5

👀 Stuff I've worked on

🔎 Ranking/recommendation model support in TruEra Diagnostics.

See the Colab notebook I wrote for this functionality here: Open In Colab

  • Python (AI service, SDK)
  • Java (Data ingestion, data querying (Trino/SQL))
  • Jupyter Notebook (demo notebook, debugging/prototyping)

📱 Efficient Deep Learning for Massive MIMO Channel State Estimation

My doctoral dissertation. Investigated techniques from statistical machine learning and deep learning to improve compressive CSI estimation in 5G networks.

🗣 LPC Vocoder

An implementation of a speech vocoder/synthesizer for a graduate course in Digital Signal Processing. Check it out here (mdelrosa/eec201_final_project).

  • Octave (vocoder implementation)
  • LaTeX (typesetting report)


data-science-work's Issues

[feedback] c04-gapminder

  • [1] I love the idea of "Epistemic status"; I may actually require something like this in the future....

  • [2] Note that you can add color labels by assigning a variable in aes(); for instance replace this:

gapminder %>%
  filter(year == year_min | year == year_max) %>%
  ggplot(aes(x = continent, y = gdpPercap)) +
  geom_boxplot() +
  geom_point(data = outliers_all, aes(x = continent, y = gdpPercap), color = 'red') +
  facet_wrap(~ year)

With the following for labels:

gapminder %>%
  filter(year == year_min | year == year_max) %>%
  ggplot(aes(x = continent, y = gdpPercap)) +
  geom_boxplot() +
  geom_point(data = outliers_all, aes(x = continent, y = gdpPercap, color = country)) +
  facet_wrap(~ year)

[feedback] c05-antibiotics

Item        Grade
Effort      S
Observed    S, 1, 3
Supported   S, 2
Assessed    S
Styled      S

  • [1] With your q1 vis, you could have investigated which bacteria are treatable with which antibiotics at greater resolution than what you present in your observations.

  • [2] Also with your q1 vis, some of your points overlap; you could fix this with a position = "dodge" in your geom_point().

  • [3] Seems like in q2 you were still focused on whether MIC < 0.1, rather than the prompt of the question.
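
The dodging suggestion in [2] can be sketched as follows. This is a minimal example under stated assumptions: `df_toy`, its column names, and its values are hypothetical stand-ins, not the challenge's antibiotics data. Because points have no width of their own, the sketch passes an explicit `width` through `position_dodge()` rather than the bare string.

```r
library(ggplot2)

# Hypothetical toy data: both drugs share the same value for each bacterium,
# so their points would sit exactly on top of each other without dodging.
df_toy <- data.frame(
  bacteria = rep(c("bacterium_a", "bacterium_b"), each = 2),
  antibiotic = rep(c("drug_1", "drug_2"), times = 2),
  mic = c(1, 1, 0.1, 0.1)
)

# position_dodge() shifts points sideways within each x category,
# one slot per group (here, per antibiotic).
p_dodged <- ggplot(df_toy, aes(x = bacteria, y = mic, color = antibiotic)) +
  geom_point(position = position_dodge(width = 0.5)) +
  scale_y_log10()

p_dodged
```

If the overlap is only partial rather than exact, `position = "jitter"` is another option, at the cost of moving points by random amounts.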

[feedback] c01-titanic

Item        Grade
Effort      S, 1
Observed    S, 2
Supported   S
Styled      U, 3, 4, 5

Comments

[1] Nice job on q1!

[2] Nice job on generalizing the calculation for q4!

[3] We haven't covered this yet, but when you have multiple factors that you're trying to visualize, putting the factors with more levels along coordinate axes is often more effective. Compare:

as_tibble(Titanic) %>%
  filter(Survived == "Yes") %>%
  ggplot(aes(x = Sex, y = n, fill = Class)) +
  geom_bar(stat = "identity", position = position_dodge())

with

as_tibble(Titanic) %>%
  filter(Survived == "Yes") %>%
  ggplot(aes(x = Class, y = n, fill = Sex)) +
  geom_bar(stat = "identity", position = position_dodge())

[4] Also, know that geom_col() is basically a shortcut for geom_bar(stat = "identity").
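
As a quick check of the equivalence in [4], the sketch below builds the same Titanic bar chart both ways and compares the computed layer data. The object names (`df_survived`, `p_bar`, `p_col`) are illustrative only and do not come from the submission.

```r
library(ggplot2)
library(tibble)
library(dplyr)

df_survived <- as_tibble(Titanic) %>%
  filter(Survived == "Yes")

# Explicit form: geom_bar() with its default counting stat switched off
p_bar <- ggplot(df_survived, aes(x = Class, y = n, fill = Sex)) +
  geom_bar(stat = "identity", position = position_dodge())

# Shortcut form: geom_col() uses the y values as given
p_col <- ggplot(df_survived, aes(x = Class, y = n, fill = Sex)) +
  geom_col(position = position_dodge())

# Compare the bar geometry each layer computes
all.equal(layer_data(p_bar), layer_data(p_col))
```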

[5] Your code for q5 would be a lot more succinct with a facet_grid( ~ Sex), rather than the manual-composite plot you've built up, for instance:

df_prop %>%
  filter(Survived == "Yes") %>%
  ggplot(aes(x = Age, y = Prop, fill = Class)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  # Set breaks and limits in one scale; a separate ylim() would replace this scale and drop the breaks
  scale_y_continuous(breaks = c(0.0, 0.25, 0.5, 0.75, 1.0), limits = c(0, 1.05)) +
  facet_grid( ~ Sex)

[feedback] c00-diamonds

Item        Grade
Effort      S
Observed    S
Supported   U, 1
Styled      S

Comments

[1] I disagree with the assertion "cut does not seem to mediate price." Rather, I'd say "cut does not seem to significantly mediate price, as compared with carat."

[feedback] c02-michelson

Item        Grade
Effort      S
Observed    S
Supported   S, 3
Styled      S, 1, 2

Comments

[1] I don't understand why you got rid of the knitr::kable() call... it's really great for making pretty markdown tables....

[2] I'm not really sure what you're doing with this ddply call:

mu <- df_err %>%
  ddply(
        "Distinctness", summarise,
        grp.mean = mean(VelocityError),
        grp.sd = sd(VelocityError),
        grp.abs_mean = mean(abs(VelocityError)),
        grp.abs_sd = sd(abs(VelocityError))
        )
mu

We already learned how to use group_by() to do a similar operation:

df_err %>%
  group_by(Distinctness) %>%
  summarize(
    grp.mean = mean(VelocityError),
    grp.sd = sd(VelocityError),
    grp.abs_mean = mean(abs(VelocityError)),
    grp.abs_sd = sd(abs(VelocityError))
  )

Personally, I find grouped summaries to be a lot more readable.

[3] Excellent work posing and testing hypotheses!

[feedback] c03-stang

Item        Grade
Effort      S
Observed    Ni, 1
Supported   S
Styled      S

Comments

  • [1] For q3 you state "Both properties decrease with increasing thickness, but not linearly." This is slippery: while both properties appear to decrease, we can't conclusively say that they do decrease.

Furthermore, we should generally distrust the edges of a regression, as that's where our fit is least trustworthy.
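
One way to see the point about edges is to look at the standard error of a fitted line at different x positions. The sketch below is an illustration only: it uses synthetic data (`df_toy`, made up here, not the c03-stang measurements) and an ordinary least-squares line rather than whatever smoother the submission used. For a straight-line fit, the standard error of the fitted mean is smallest at the mean of x and grows toward both ends of the data.

```r
# Synthetic data, invented purely for illustration
set.seed(101)
df_toy <- data.frame(x = seq(0, 10, length.out = 25))
df_toy$y <- 2 + 0.5 * df_toy$x + rnorm(25)

fit <- lm(y ~ x, data = df_toy)

# Standard error of the fitted mean at the left edge, center, and right edge of x
pred <- predict(fit, newdata = data.frame(x = c(0, 5, 10)), se.fit = TRUE)
se_by_position <- setNames(pred$se.fit, c("left", "center", "right"))
se_by_position
```

The same widening shows up as the flared confidence band that `geom_smooth()` draws near the ends of the data.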
