urlexpander's Introduction

UrlExpander

A Go package that provides an API for expanding shortened URLs from services such as goo.gl, bitly.com, and tinyurl.com.

Features

  • Expands shortened URLs as quickly as possible by sending a lightweight HEAD request to the shortening service
  • Uses a local cache to handle repeated queries
  • Respects robots.txt, in case the shortening service does not allow crawlers
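
To illustrate the HEAD-request idea from the list above, here is a minimal sketch using only the Go standard library. It is not the package's actual implementation: the names expandOnce, newNoFollowClient and demoExpandOnce are hypothetical, and the "shortener" is a local test server that redirects every path to a made-up long URL.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newNoFollowClient returns a client that stops at the first redirect
// instead of following it, so the Location header can be inspected.
func newNoFollowClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// expandOnce sends a single HEAD request to shortURL and returns the
// redirect target announced in the Location header.
func expandOnce(client *http.Client, shortURL string) (string, error) {
	resp, err := client.Head(shortURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 300 || resp.StatusCode > 399 {
		return "", fmt.Errorf("expected a redirect, got status %d", resp.StatusCode)
	}
	loc, err := resp.Location()
	if err != nil {
		return "", err
	}
	return loc.String(), nil
}

// demoExpandOnce runs expandOnce against a stand-in shortener
// (an assumption made for this example, not a real service).
func demoExpandOnce() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "http://example.com/long/path", http.StatusMovedPermanently)
	}))
	defer srv.Close()
	return expandOnce(newNoFollowClient(), srv.URL+"/HFoP0a")
}

func main() {
	expanded, err := demoExpandOnce()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(expanded)
}
```

A HEAD request keeps the exchange small because the server sends headers only. A real expander would probably also need to handle chained redirects and services that reject HEAD.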

Usage

This project can be used either as a library from Go code or as a standalone service providing an HTTP API.

Library

import "github.com/vanekjar/urlexpander/lib"

expander := urlexpander.New()
expanded, err := expander.ExpandUrl("https://goo.gl/HFoP0a")

HTTP API server

Install UrlExpander locally

go get github.com/vanekjar/urlexpander

Run the command below to start a local HTTP server (it listens on port 8080 by default)

urlexpander

Check that the server is running by visiting http://localhost:8080

API description

Request
GET http://urlexpander.tk/api/expand?url=https://goo.gl/HFoP0a
Response
200 OK
{"original":"https://goo.gl/HFoP0a", "expanded":"http://urlexpander.tk"} 
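
A client for this endpoint might look like the sketch below. Only the /api/expand path, the url query parameter, and the original and expanded JSON fields come from the description above. The names expandResponse, expandViaAPI and demoAPIClient are hypothetical, and the demo talks to a local stand-in server rather than a real UrlExpander instance.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// expandResponse mirrors the JSON body shown in the API description.
type expandResponse struct {
	Original string `json:"original"`
	Expanded string `json:"expanded"`
}

// expandViaAPI calls GET {baseURL}/api/expand?url=... and decodes the reply.
func expandViaAPI(baseURL, shortURL string) (expandResponse, error) {
	var out expandResponse
	endpoint := baseURL + "/api/expand?url=" + url.QueryEscape(shortURL)
	resp, err := http.Get(endpoint)
	if err != nil {
		return out, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return out, fmt.Errorf("unexpected status %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out, err
}

// demoAPIClient exercises the client against a fake server that echoes the
// requested URL and returns the sample expansion from the description.
func demoAPIClient() (expandResponse, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(expandResponse{
			Original: r.URL.Query().Get("url"),
			Expanded: "http://urlexpander.tk",
		})
	}))
	defer srv.Close()
	return expandViaAPI(srv.URL, "https://goo.gl/HFoP0a")
}

func main() {
	res, err := demoAPIClient()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s -> %s\n", res.Original, res.Expanded)
}
```

To use it against a locally running server, pass http://localhost:8080 as baseURL.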

Configuration

conf := urlexpander.Config{
  // Expanded urls are cached for repeated queries. Set cache capacity.
  CacheCapacity:     cacheCapacity,
  // Set cache expiration time in minutes.
  CacheExpiration:   cacheExpiration,
  // User agent string used when translating shortened url.
  UserAgent:         userAgent,
  // Maximum length of shortened url. It is assumed that no shortened url is longer than that.
  ShortUrlMaxLength: 32,
}

expander := urlexpander.NewFromConfig(conf)
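
To picture how a capacity limit and an expiration time interact, here is a small sketch of a bounded cache with per-entry expiry. It is an illustration only and says nothing about how the package caches internally: ttlCache, its methods and demoCache are hypothetical names, and evicting the entry that expires soonest is an assumption chosen for brevity.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry holds a cached expansion and the time at which it stops being valid.
type entry struct {
	value   string
	expires time.Time
}

// ttlCache is a hypothetical bounded cache with per-entry expiration.
type ttlCache struct {
	mu       sync.Mutex
	capacity int
	ttl      time.Duration
	now      func() time.Time // injectable clock, so expiry can be shown without sleeping
	items    map[string]entry
}

func newTTLCache(capacity int, ttl time.Duration, now func() time.Time) *ttlCache {
	return &ttlCache{capacity: capacity, ttl: ttl, now: now, items: make(map[string]entry)}
}

// get returns the cached value unless it is missing or has expired.
func (c *ttlCache) get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok {
		return "", false
	}
	if !c.now().Before(e.expires) {
		delete(c.items, key)
		return "", false
	}
	return e.value, true
}

// put stores a value; when the cache is full and the key is new, the entry
// that expires soonest is evicted first.
func (c *ttlCache) put(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.items[key]; !exists && len(c.items) >= c.capacity {
		var oldest string
		first := true
		for k, e := range c.items {
			if first || e.expires.Before(c.items[oldest].expires) {
				oldest, first = k, false
			}
		}
		delete(c.items, oldest)
	}
	c.items[key] = entry{value: value, expires: c.now().Add(c.ttl)}
}

// demoCache reports whether a lookup hits before and after the TTL elapses.
func demoCache() (hitBefore, hitAfter bool) {
	clock := time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC)
	c := newTTLCache(2, 10*time.Minute, func() time.Time { return clock })
	c.put("https://goo.gl/HFoP0a", "http://urlexpander.tk")
	_, hitBefore = c.get("https://goo.gl/HFoP0a")
	clock = clock.Add(11 * time.Minute) // move the fake clock past the TTL
	_, hitAfter = c.get("https://goo.gl/HFoP0a")
	return hitBefore, hitAfter
}

func main() {
	before, after := demoCache()
	fmt.Println(before, after) // prints "true false"
}
```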

urlexpander's Issues

Prevent too many parallel requests on the same webserver

It's possible to issue enough "random" requests against url-expander at the same time, such that it then itself issues parallel requests to the same target web server, potentially overloading it.

Test case

A simple server that counts concurrently handled requests (in practice this would be a remote server hosting the short link redirects):

> cat us.go 
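package main

import (
	"fmt"
	"log"
	"net/http"
	"sync/atomic"
	"time"
)

func main() {
	app := App{reqs: 0}
	http.HandleFunc("/", app.expand)
	// ListenAndServe only returns on failure, so report the error instead of dropping it.
	log.Fatal(http.ListenAndServe(":7777", nil))
}

// App tracks the number of requests currently being handled.
type App struct {
	reqs uint32
}

func (a *App) expand(w http.ResponseWriter, r *http.Request) {
	// Count this request in and print the current concurrency level.
	rq := atomic.AddUint32(&a.reqs, 1)
	fmt.Printf(">> %d\n", rq)
	// Simulate a slow target server.
	time.Sleep(1 * time.Second)
	// Adding ^uint32(0) decrements the counter by one.
	rq = atomic.AddUint32(&a.reqs, ^uint32(0))
	w.Write([]byte(`{error: "foo"}`))
	fmt.Printf("<< %d\n", rq)
}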

With this server and url-expander running on the same machine, one way to simulate the problem is as follows.

# 8080 is the url-expander
# 7777 is the us server / our fake target web server
> seq 10 | parallel --verbose -j10 curl --silent "http://localhost:8080/api/expand\?url=http://localhost:7777/{}"

The us server output then shows something along the lines of:

>> 1
>> 2
>> 3
>> 4
>> 5
>> 6
>> 7
>> 8
>> 9
>> 10
<< 9
<< 8
<< 7
<< 6
<< 5
<< 4
<< 3
<< 2
<< 1
<< 0

Unfortunately, I couldn't find a way to obtain the remote IP of an already established connection in order to build a "live map" of the work currently in flight. Resolving the host up front with net.LookupIP may be good enough (although, unlike Java, Go does not seem to cache these lookups).

Of course, this issue is not critical in controlled environments; rather just something to consider when exposing url-expander to the public.
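
One possible mitigation is to cap the number of in-flight requests per target host with a semaphore. The sketch below is an assumption about how that could look, not code from url-expander: hostLimiter, acquire and demoLimiter are hypothetical names, and the limiter keys on the host name taken from the URL rather than on the resolved IP discussed above, so several host names pointing at one server would still be counted separately.

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
	"sync/atomic"
	"time"
)

// hostLimiter caps the number of concurrent requests per target host.
type hostLimiter struct {
	mu    sync.Mutex
	limit int
	slots map[string]chan struct{} // one buffered channel (semaphore) per host
}

func newHostLimiter(limit int) *hostLimiter {
	return &hostLimiter{limit: limit, slots: make(map[string]chan struct{})}
}

// acquire blocks until a slot for the URL's host is free and returns a
// function that releases the slot again.
func (l *hostLimiter) acquire(rawURL string) (func(), error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	l.mu.Lock()
	sem, ok := l.slots[u.Host]
	if !ok {
		sem = make(chan struct{}, l.limit)
		l.slots[u.Host] = sem
	}
	l.mu.Unlock()
	sem <- struct{}{}
	return func() { <-sem }, nil
}

// demoLimiter starts n workers against the same host and returns the highest
// number of workers that were inside the limited section at the same time.
func demoLimiter(limit, n int) int32 {
	l := newHostLimiter(limit)
	var inFlight, peak int32
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			release, err := l.acquire(fmt.Sprintf("http://localhost:7777/%d", i))
			if err != nil {
				return
			}
			defer release()
			cur := atomic.AddInt32(&inFlight, 1)
			// Record the peak with a compare-and-swap loop.
			for {
				p := atomic.LoadInt32(&peak)
				if cur <= p || atomic.CompareAndSwapInt32(&peak, p, cur) {
					break
				}
			}
			time.Sleep(20 * time.Millisecond) // stands in for the outgoing request
			atomic.AddInt32(&inFlight, -1)
		}(i)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println("peak concurrency:", demoLimiter(2, 10))
}
```

With a limit of 2, the observed peak never exceeds 2 even though ten workers start at once. The map of semaphores grows with every new host, so a long-running service would also need some form of cleanup.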
