abjData

Overview

This package contains a set of databases frequently used by ABJ.

The included data come from the municipal Human Development Index, collected from the Human Development Atlas, and from cartographic databases.

The purpose of the package is to make databases available for quick use in other projects and as a resource for the Jurimetrics book.

Installation

install.packages("abjData")
## dev version
# remotes::install_github("abjur/abjData")

Available datasets

Dataset       Description
assuntos      Data that contains information about case types.
cadmun        (LEGACY) A dataset that contains the municipality codes.
muni          Useful data from municipalities to join with other databases.
pnud_muni     A dataset containing UNDP information from municipalities by years.
pnud_min      Minimal base of UNDP municipalities to make quick studies.
pnud_siglas   A dataset that serves as a glossary of available acronyms.
pnud_uf       A dataset that contains information about UNDP of Federative Units.
leiloes       Auctions dataset used in our book.
consumo       Consumer cases dataset used in our book.

How to use

Once installed, load {abjData} like any other R package and refer to the dataset you want by name:

library(abjData)
library(tidyverse)
glimpse(pnud_siglas)
#> Rows: 8
#> Columns: 4
#> $ sigla      <chr> "espvida", "gini", "rdpc", "pop", "idhm", "idhm_e", "idhm_l…
#> $ nome_curto <chr> "Esperança de vida ao nascer", "Índice de Gini", "Renda per…
#> $ nome_longo <chr> "Esperança de vida ao nascer", "Índice de Gini", "Renda per…
#> $ definicao  <chr> "Número médio de anos que as pessoas deverão viver a partir…

Chart examples

Municipal Human Development Index:

pnud_min |>
  pivot_longer(starts_with("idhm")) |> 
  mutate(tipo = case_when(
    name == "idhm" ~ "General",
    name == "idhm_e" ~ "Education",
    name == "idhm_l" ~ "Longevity",
    name == "idhm_r" ~ "Income"
  )) |> 
  mutate(
    regiao_nm = fct_reorder(regiao_nm, value, median, .desc = TRUE),
    tipo = lvls_reorder(tipo, c(2, 1, 3, 4))
  ) |> 
  ggplot() +
  geom_boxplot(
    aes(value, regiao_nm), 
    colour = "#102C68", 
    fill = "#7AD151"
  ) +
  facet_wrap(~tipo) +
  # theme_bw() is a complete theme, so it must come before any theme() tweaks
  theme_bw(12) +
  theme(legend.position = "none") +
  labs(
    x = "IDHM", 
    y = "Region"
  )

Position of municipalities:

muni |> 
  ggplot(aes(lon, lat)) +
  geom_point(size = .1, colour = viridis::viridis(2, begin = .2, end = .8)[1]) +
  coord_equal() +
  theme_void()

Requirements

{abjData} requires R version greater than or equal to 3.4.

License

{abjData} is licensed under the MIT License.

abjdata's People

Contributors

azeloc, jtrecenti, katerine-dev, rcfeliz, rmhirota

abjdata's Issues

Release abjData 1.0.0

Prepare for release:

  • Check that description is informative
  • Check licensing of included files
  • devtools::build_readme()
  • usethis::use_cran_comments()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('major')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_news_md()
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Update install instructions in README
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

Web scraping TJSP first-instance cases

Good afternoon,

I am scraping the TJSP first-instance case pages with R.
So far I have written the following code:

library(devtools)
library(esaj)
library(xml2)
library(rvest)
library(httr)
library(stringr)

ids <- c(
  "0000337-06.2013.8.26.0431",
  "0000536-96.2011.8.26.0431",
  "0000566-63.2013.8.26.0431",
  "0000743-27.2013.8.26.0431",
  "0001076-18.2009.8.26.0431",
  "0001861-09.2011.8.26.0431",
  "0001891-10.2012.8.26.0431",
  "0002500-32.2008.8.26.0431",
  "0002583-43.2011.8.26.0431",
  "0003372-08.2012.8.26.0431",
  "0003389-44.2012.8.26.0431",
  "0003457-91.2012.8.26.0431",
  "0003697-22.2008.8.26.0431",
  "0004104-57.2010.8.26.0431",
  "0004924-08.2012.8.26.0431"
)
esaj::download_cpopg(ids, "C:/Users/Danilo/Desktop/bases/html/")

# `arq` is not defined in the post; presumably the path to one downloaded HTML file
h <- arq %>%
  xml2::read_html(encoding = "UTF-8")
todas_partes <- h %>% rvest::html_nodes("table") %>% length()

## Extract the parent tables from the HTML file
nodes <- h %>% rvest::html_nodes("table")

## Extract the td cells inside those tables
lixo <- nodes %>% rvest::html_nodes("td")

## Extract the tr rows nested inside the cells
lixo2 <- lixo %>% rvest::html_nodes("tr")

## Clean up the text
lixo3 <- lixo2 %>%
  rvest::html_text() %>%
  stringr::str_trim() %>%
  stringr::str_replace_all("&nbsp", "") %>%
  stringr::str_replace_all("[\t]", "") %>%
  stringr::str_replace_all("[\n]", "") %>%
  stringr::str_replace_all("[\r]", "") %>%
  data.frame(stringsAsFactors = FALSE)

However, I end up with a table in which each element holds several pieces of information; element [29], for example, contains all the information under "Dados do processo" (case data).
Should I therefore write a while loop to split the content into columns?
What is the best practice for building a dataset of TJSP case data?

Thank you in advance for the help.

Best regards,

Object br_uf_map missing in the new version

Hello, how are you?
First, I would like to thank you for the package you provide. It has been a great help.

One of the objects we use is br_uf_map, which seems to be missing in this new version. Is it still possible to access it in some way? Has it been renamed?

Missing municipalities in Brazil

When you run the command:

pnud_muni %>% filter(ano == 2010) %>% select(municipio) %>% count()

you find that the pnud database contains 5565 municipalities.

However, Brazil is known to have 5570 municipalities.

So some municipalities seem to be missing from the pnud database, but I am not yet sure which ones.
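One way to find out which ones are missing is to compare the codes in the pnud data against a complete reference list. The sketch below is only an illustration on made-up data: `reference`, `pnud_codes`, and the column names `code` and `name` are hypothetical stand-ins, not the actual columns of pnud_muni or muni.

```r
# Hypothetical reference list of municipalities (stand-in for an official list).
reference <- data.frame(
  code = c("110001", "110002", "110003", "110004"),
  name = c("A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the codes found in the pnud data for one year.
pnud_codes <- c("110001", "110003")

# Keep the reference rows whose code does not appear in the pnud data.
missing_munis <- reference[!reference$code %in% pnud_codes, ]
print(missing_munis)
```

With the real datasets, the same comparison would need a key column that is present in both tables.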

Extra space in an acronym in the pnud_siglas dataset

Hi! I was trying to build an acronym file for pnud_min and noticed that one acronym has an extra space. I fixed it with a simple case_when, but in case you want to change it in the package: it is in the abjData::pnud_siglas dataset, column sigla, and the acronym with the extra space is "idhm_ e".

Example:

abjData::pnud_siglas |>
  mutate(
    sigla = case_when(
      sigla == "idhm_ e" ~ "idhm_e",      
      TRUE ~ sigla
    )
  ) |>
  filter(sigla %in% names(abjData::pnud_min))
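A more general alternative, sketched here on a made-up vector rather than on the real dataset, is to strip all whitespace from the acronyms instead of matching one specific value. The vector `siglas` is a hypothetical stand-in for the sigla column, and the approach assumes no acronym legitimately contains a space.

```r
# Hypothetical stand-in for the sigla column, including the stray space.
siglas <- c("espvida", "idhm_ e", " gini")

# Remove every run of whitespace, wherever it occurs in the string.
siglas_clean <- gsub("[[:space:]]+", "", siglas)
print(siglas_clean)
```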
