
spring-git-scraper's Introduction

GitScraper

An app that scrapes user data from GitHub.

Table of contents

Overview

The challenge

Users should be able to:

  • Retrieve API data about a GitHub user by pinging a REST endpoint.
  • See the user data displayed as JSON.

UML Diagram

Architecture Diagram

Screenshot

My process

Built With

  • Spring Boot v2.7.7
    • Spring Web
    • Spring Cache
    • Spring Retry v1.3.2
    • Spring AOP (required dependency for Retry)
  • Java 11
  • Project Lombok
  • Testing
    • JUnit 5
    • AssertJ v3.24.1
  • Springfox API Documentation v3.0.0
  • GitHub REST API

How to Scrape Data - Native

  1. Start the app with the ./gradlew bootRun command.
    • On Windows, run: gradlew.bat bootRun (or gradle bootRun if Gradle is installed globally)
  2. Ping the REST endpoint with the command: curl -v localhost:8080/scraper/api/v1/git/${username} | json_pp, or use Postman.
    • Replace ${username} with a valid GitHub username.
  3. The endpoint returns the requested user data as JSON.
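
For readers who prefer calling the endpoint from code rather than curl, the request in step 2 can be sketched with the JDK 11 HttpClient. This is a minimal sketch, not part of the project: the class name ScraperClientSketch, its method names, and the default username "octocat" are hypothetical, and the host and port assume the app was started locally as in step 1.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of the spring-git-scraper code base.
class ScraperClientSketch {

    // Path taken from the curl command above; host and port assume a local bootRun.
    static final String BASE_URL = "http://localhost:8080/scraper/api/v1/git/";

    // Builds the endpoint URI for a username, encoding it defensively.
    static URI endpointFor(String username) {
        return URI.create(BASE_URL + URLEncoder.encode(username, StandardCharsets.UTF_8));
    }

    // Sends a GET request and returns the raw response body (expected to be JSON).
    static String fetchUserJson(String username) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(endpointFor(username))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        String username = args.length > 0 ? args[0] : "octocat";
        System.out.println("GET " + endpointFor(username));
        try {
            System.out.println(fetchUserJson(username));
        } catch (IOException e) {
            // Reached when the app is not running on localhost:8080.
            System.out.println("Request failed: " + e);
        }
    }
}
```

Running the class with a username argument while the app is up should print the same JSON that curl returns; if the app is not running, the sketch reports the connection failure instead of crashing.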

How to Scrape Data - Docker (Optional)

  1. Ensure you have Docker installed; if you don't, install it first.
  2. Pull the image from my Docker Hub: docker pull belum/spring-git-scraper:latest
  3. Check if the image was downloaded successfully: docker images
  4. Run the image with: docker run -it -p 8080:8080 belum/spring-git-scraper:latest
  5. Interact with the endpoint as described in steps 2 and 3 of the Native instructions.

What I Learned

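I learned how to get Jackson to serialize Java 8 date/time (java.time) types by adding the JSR-310 module and registering it on the ObjectMapper.

    // build.gradle
    implementation 'com.fasterxml.jackson.datatype:jackson-datatype-jsr310:2.13.4'
    testRuntimeOnly 'com.fasterxml.jackson.datatype:jackson-datatype-jsr310:2.13.4'

    // ApplicationConfig.java
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class ApplicationConfig {
      @Bean
      public ObjectMapper objectMapper() {
        ObjectMapper mapper = new ObjectMapper();
        // Adds serializers and deserializers for java.time types.
        mapper.registerModule(new JavaTimeModule());
        return mapper;
      }
    }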
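
As background on the types involved, the sketch below uses only the JDK (no Jackson) to show the kind of java.time value the module is there to handle. It assumes a timestamp in ISO 8601 UTC form; the class name TimestampSketch and the sample value are hypothetical and are not taken from the project.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical, JDK-only illustration; the project itself relies on Jackson for this.
class TimestampSketch {

    // Parses an ISO 8601 UTC timestamp such as "2011-01-26T19:01:12Z".
    static Instant parse(String isoTimestamp) {
        return Instant.parse(isoTimestamp);
    }

    // Formats an Instant back into its ISO 8601 UTC text form.
    static String format(Instant instant) {
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    public static void main(String[] args) {
        Instant createdAt = parse("2011-01-26T19:01:12Z"); // sample value, assumed
        System.out.println(format(createdAt));
    }
}
```

A model field declared as Instant (or another java.time type) carries this kind of value; registering JavaTimeModule is what lets Jackson read and write it.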

I learned how to cache the data using Spring Cache.

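    // build.gradle
    implementation 'org.springframework.boot:spring-boot-starter-cache'

    // ApplicationConfig.java
    import java.util.List;
    import org.springframework.cache.CacheManager;
    import org.springframework.cache.annotation.EnableCaching;
    import org.springframework.cache.concurrent.ConcurrentMapCache;
    import org.springframework.cache.support.SimpleCacheManager;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    @EnableCaching
    public class ApplicationConfig {
      @Bean
      public CacheManager cacheManager() {
        // One in-memory cache per cached service method.
        SimpleCacheManager cacheManager = new SimpleCacheManager();
        cacheManager.setCaches(List.of(
                new ConcurrentMapCache("users"),
                new ConcurrentMapCache("repos")
        ));
        return cacheManager;
      }
    }

    // GithubClientImpl.java
    import org.springframework.http.HttpEntity;
    import org.springframework.http.HttpHeaders;
    import org.springframework.stereotype.Component;

    @Component
    public class GithubClientImpl implements GithubClient {

      // Builds the request entity sent to the GitHub API, including a Cache-Control header.
      private HttpEntity<String> httpEntity() {
        HttpHeaders headers = new HttpHeaders();
        headers.set("Cache-Control", "public, max-age=60, s-maxage=60");
        return new HttpEntity<>(headers);
      }
    }

    // GithubServiceImpl.java
    import java.util.List;
    import org.springframework.cache.annotation.Cacheable;
    import org.springframework.stereotype.Service;

    @Service
    public class GithubServiceImpl implements GithubService {

      @Override
      @Cacheable(value = "users") // cached per username in the "users" cache
      public GitUser getUserData(String username) {
        //blank for brevity
      }

      @Override
      @Cacheable(value = "repos") // cached per username in the "repos" cache
      public List<GitRepo> getRepoData(String username) {
        //blank for brevity
      }
    }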
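
To make the effect of @Cacheable concrete, here is a rough, JDK-only sketch of the idea behind it: the first call for a given username runs the real lookup, and later calls with the same argument are answered from a map. This is an analogy, not how Spring implements caching (Spring wraps the bean in a proxy), and the class name CacheSketch, the loader function, and the loads counter are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a service method annotated with @Cacheable("users").
class CacheSketch {

    // Plays the role of the "users" cache: keyed by the method argument.
    private final Map<String, String> users = new ConcurrentHashMap<>();

    // Stands in for the expensive call to the GitHub API (an assumption).
    private final Function<String, String> loader;

    // Counts how often the loader actually ran, to make cache hits visible.
    final AtomicInteger loads = new AtomicInteger();

    CacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    // Returns the cached value if present; otherwise loads it once and stores it.
    String getUserData(String username) {
        return users.computeIfAbsent(username, key -> {
            loads.incrementAndGet();
            return loader.apply(key);
        });
    }

    public static void main(String[] args) {
        CacheSketch cache = new CacheSketch(name -> "data for " + name);
        cache.getUserData("octocat");
        cache.getUserData("octocat"); // served from the map, loader not called again
        System.out.println(cache.loads.get()); // prints 1
    }
}
```

As in this sketch, entries in a ConcurrentMapCache stay until they are removed explicitly, since that cache type has no built-in expiry, so repeated requests for the same username keep returning the first result for the lifetime of the app.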

Useful resources

Author
