blog's People

Contributors

arrowrowe, at15, codeworm96, commouse, dependabot[bot], gaocegege, hebingchang

blog's Issues

[post] How ANTLR and its runtime works

Type

  • request post from @at15

Related

Description

While working on gce4, I spent a lot of time on the parser, even though parsing is not the main focus of an RDBMS.
But writing ASTs by hand in unit tests kills productivity, and a handwritten parser requires too much work for a large language like SQL.
Besides, gce4 is implemented in multiple languages.

ANTLR is not that straightforward to use. For example:

  • The Rust runtime is not official yet, and its author seems to have been inactive for a while. Besides, we had a proposal for a Rust target, dyweb/mos#20, back in 2017
  • The Go generator and runtime can be optimized by using a type switch
  • The C++ generator has some small problems with C++20 (sign ... cpp

In order to work on the Rust runtime, I need to know how ANTLR works. That said, writing a runtime and generator template should not be that hard, because the algorithm lives in ANTLR itself and one can reference the runtimes of other languages.

Update

  • change issue type from request to discuss

[util] HTTPS support

Type

util/toolchain

Description

Since the domain *.dongyueweb.com supports https, we may change the generator to support https links globally.

@gaocegege

[post] How Elasticsearch uses Lucene's index time join to handle nested objects

Type

  • request post from @at15

Related

None so far

Description

When indexing an object that contains a nested array of objects, the default behavior in ES is often not what you expect, because it flattens the data.
Take the T-shirt search example from https://blog.mikemccandless.com/2012/01/searching-relational-content-with.html:

{
   "name": "dyweb",
   "specs": [
      { "size": "xl", "color": "blue" },
      { "size": "xxl", "color": "red" }
   ]
}

If you index it directly in ES, it is flattened into:

{
   "name": "dyweb",
   "specs.size": ["xl", "xxl"],
   "specs.color": ["blue", "red"]
}

The flattened form matches a query such as name = dyweb & size = xl & color = red, which is actually invalid, because the xl shirt is blue while the red shirt is xxl ...
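To make the false match concrete, here is a minimal sketch in plain Go. It does not use ES or Lucene at all; the Spec type and the matchFlattened and matchNested helpers are hypothetical names that only simulate the two matching behaviors described above.

```go
package main

import "fmt"

// Spec is one entry in the "specs" array of the example document.
type Spec struct {
	Size  string
	Color string
}

// matchFlattened mimics the flattened form: size and color are checked
// against two independent value lists, so the pairing between them is lost.
func matchFlattened(specs []Spec, size, color string) bool {
	hasSize, hasColor := false, false
	for _, s := range specs {
		if s.Size == size {
			hasSize = true
		}
		if s.Color == color {
			hasColor = true
		}
	}
	return hasSize && hasColor
}

// matchNested requires a single spec to satisfy both conditions, which is
// the behavior a nested query is meant to provide.
func matchNested(specs []Spec, size, color string) bool {
	for _, s := range specs {
		if s.Size == size && s.Color == color {
			return true
		}
	}
	return false
}

func main() {
	specs := []Spec{{"xl", "blue"}, {"xxl", "red"}}
	fmt.Println("flattened xl+red:", matchFlattened(specs, "xl", "red")) // true (the invalid match)
	fmt.Println("nested xl+red:", matchNested(specs, "xl", "red"))       // false
	fmt.Println("nested xl+blue:", matchNested(specs, "xl", "blue"))     // true
}
```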

The solution in ES is the nested field type: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
Under the hood it uses Lucene's join module, which provides both an index-time join (using a block of documents) and a query-time join.

If time allows, I plan to cover

  • The code path in ES
    • store to engine
    • generate nested query
    • collect matched document
  • How to do this using Lucene directly, with a demo project. We might need a new repo to hold demo code, like https://github.com/cloudflare/cloudflare-blog does

Update

  • 2022-12-19 Init. No more procrastinating; if I flake on this again, I will still be single this year :doge:

[post] Gracefully shut down an HTTP server from its handler in Rust warp

Type

  • request post from @at15

Related

Description

NOTE: The up to date note is in http://doc/github.com/at15/rust-learning/lib/warp/

I want a /shutdown route that lets me shut down an HTTP server by hitting that endpoint.
However, this is not as easy in Rust as in Go (create a context/channel for cancellation, then start two goroutines: one blocks on the server, one waits for the cancel). The Rust code is harder because Rust uses ownership instead of GC and has mutability rules.
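For the Go side, a minimal sketch of that idea using only the standard library could look like the following. The helper names (newServer, serveUntilShutdown, runShutdownDemo) are made up for illustration, and the calling goroutine plays the role of the one that waits for the cancel signal.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// newServer returns a server whose /shutdown handler closes the stop channel.
func newServer() (*http.Server, <-chan struct{}) {
	stop := make(chan struct{})
	var once sync.Once // guard against closing the channel twice
	mux := http.NewServeMux()
	mux.HandleFunc("/shutdown", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "shutting down")
		once.Do(func() { close(stop) })
	})
	return &http.Server{Handler: mux}, stop
}

// serveUntilShutdown blocks until /shutdown is hit, then drains the server.
func serveUntilShutdown(ln net.Listener) error {
	srv, stop := newServer()
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }() // this goroutine blocks on the server

	select {
	case err := <-errCh: // the server failed before any shutdown request
		return err
	case <-stop: // the handler asked us to stop
	}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	// Serve returns http.ErrServerClosed after a successful Shutdown.
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// runShutdownDemo starts the server on a random local port and hits /shutdown.
func runShutdownDemo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	done := make(chan error, 1)
	go func() { done <- serveUntilShutdown(ln) }()

	resp, err := http.Get("http://" + ln.Addr().String() + "/shutdown")
	if err != nil {
		return err
	}
	resp.Body.Close()
	return <-done
}

func main() {
	if err := runShutdownDemo(); err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("server stopped gracefully")
}
```

The handler only signals through the channel; the actual Shutdown call happens outside the handler, so the in-flight /shutdown request can finish before the server drains.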

The post will contain the following

  • how to do graceful shutdown in Go
  • the iterations of getting the shutdown logic working in Rust
    • closure trait Fn, FnMut, FnOnce
    • channel (in tokio)
    • passing a channel to Fn closure
    • mutex

Update

[post] Reading notes on How to Take Smart Notes

Type

  • request post from @at15

Related

Description

Recently I've been reading the book How to Take Smart Notes, and I have found it helpful for both reading papers and scheduling my daily workflow (not limited to writing). I created this issue to remind me to write a post after I finish the book (I'm halfway through it now).

Some key points mentioned by the book

  • build notes from bottom up
  • separate different types of notes: fleeting, permanent, and project-specific

Update

Add pullapprove and detail for contribution

  • contribution.md
  • pr template
  • issue template
  • issue labels
  • pullapprove integration
  • the workflow for submitting a new post

@gaocegege @arrowrowe

Update

  • fix the typo depoly in CONTRIBUTING.md mentioned by @JasonQSY
  • reduce the number of blog reviewers to 2
  • show a hint for creating an author page when the author is not present
  • show YAML for the user to copy and paste into config.yaml, or use some dark magic to do string replacement instead, since dumping YAML would erase all the comments
  • create a team for the blog and add it to this project's collaborators so more people like @JasonQSY can review

[post] gommon format: A customized goimports for formatting go code

Type

  • request post from @at15

Related

Description

goimports works better than gofmt because it groups imports. However, sometimes I want to enforce extra rules, like putting proto imports at the bottom, grouping imports from the current project together, and warning or erroring when certain packages are imported, e.g. github.com/pkg/errors.

The code is still in progress in dyweb/gommon#127 (well, a 5+ month old PR ... procrastinating ...). The post plans to cover the following

  • how gofmt and goimports work
  • how to parse Go code, and the basic structure of the Go AST
  • how to modify the Go AST and dump it back to text
  • why we designed gommon format this way
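The parse, modify and dump steps above can be sketched with the standard go/parser and go/format packages. This is not how gommon format is implemented; rewriteImport and the sample source are hypothetical, and rewriting github.com/pkg/errors to errors is only an illustration of touching import specs.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
)

// src is a made-up input file for the demo.
const src = `package demo

import (
	"fmt"
	"github.com/pkg/errors"
)

func f() error { return errors.New(fmt.Sprint("boom")) }
`

// rewriteImport parses src, replaces the import path from with to, and
// prints the modified AST back to Go source. It also returns how many
// import specs were rewritten.
func rewriteImport(src, from, to string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}
	n := 0
	for _, imp := range file.Imports {
		// Path.Value is the quoted string literal, e.g. "\"fmt\"".
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return "", 0, err
		}
		if path == from {
			imp.Path.Value = strconv.Quote(to)
			n++
		}
	}
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), n, nil
}

func main() {
	out, n, err := rewriteImport(src, "github.com/pkg/errors", "errors")
	if err != nil {
		fmt.Println("rewrite failed:", err)
		return
	}
	fmt.Printf("rewrote %d import(s)\n%s", n, out)
}
```

The same loop over file.Imports is a natural place for a check that warns or errors on banned packages instead of rewriting them.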

Update

[post] Function calling convention in C, Go, Rust and OS

Type

  • request post from @at15

Related

Description

While working on the blogos tutorial, I started looking at the ABI for function calls in different programming languages.

The post is expected to cover the following

  • how a procedure call is implemented (based on CS:APP chapter 3): transferring control, passing parameters, etc.
  • dumping asm for different programming languages
  • a short intro to asm so you can read the dump
  • return values larger than the rax register (i.e. use stack ...)
  • specifics of writing handlers in an OS, e.g. why interrupt handlers are different

Update

[post] ANTLR parser generator and visitor pattern

Type

  • request post from @at15

Related

Description

While working on gegecece and writing its SQL parser (as ANTLR rules), the main problem has been converting the parse tree to an AST.
The AST in SQL is the (unresolved) logical plan. There are many ways to define the rules and to traverse the parse tree.
A common one is the visitor pattern. However, looking at the code of some databases, e.g. tidb and es, they don't build the plan in a very visitor-like way. This post tries to address the following questions

  • what is visitor pattern, how to implement one (in different languages)
  • use visitor pattern in ANTLR
  • not using visitor pattern in ANTLR (when it comes to SQL)
  • alternative label in ANTLR & visitor
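As a starting point for the first and third questions, here is a minimal sketch in Go that evaluates a tiny expression tree twice: once with a classic visitor (double dispatch through Accept) and once with a plain type switch. The Node, Num and Add types are hypothetical stand-ins for a parse tree and are not ANTLR generated code.

```go
package main

import "fmt"

// Node is a tiny expression tree standing in for a parse tree.
type Node interface {
	Accept(v Visitor) int
}

// Visitor has one method per concrete node type.
type Visitor interface {
	VisitNum(n Num) int
	VisitAdd(n Add) int
}

// Num is a leaf holding an integer literal.
type Num struct{ Value int }

// Add is an inner node with two children.
type Add struct{ Left, Right Node }

// Each node dispatches to the visitor method that matches its own type.
func (n Num) Accept(v Visitor) int { return v.VisitNum(n) }
func (n Add) Accept(v Visitor) int { return v.VisitAdd(n) }

// evalVisitor evaluates the tree through double dispatch.
type evalVisitor struct{}

func (evalVisitor) VisitNum(n Num) int { return n.Value }
func (e evalVisitor) VisitAdd(n Add) int {
	return n.Left.Accept(e) + n.Right.Accept(e)
}

// evalSwitch is the non-visitor alternative: a type switch over node types.
func evalSwitch(n Node) int {
	switch t := n.(type) {
	case Num:
		return t.Value
	case Add:
		return evalSwitch(t.Left) + evalSwitch(t.Right)
	default:
		panic(fmt.Sprintf("unknown node %T", n))
	}
}

func main() {
	tree := Add{Num{1}, Add{Num{2}, Num{3}}}
	fmt.Println(tree.Accept(evalVisitor{}), evalSwitch(tree)) // 6 6
}
```

The visitor keeps each operation in its own type at the cost of an Accept method on every node, while the type switch keeps the traversal in one function and needs no support from the node types.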

Update

[post] Error handling in go, rust, java, cpp and gRPC

Type

  • request post from @at15

Related

Description

NOTE: The draft is in go/pl/doc/error.md

You can't avoid error handling in any programming language (just like you can't avoid death and the IRS).
Although the philosophy and runtime implementation differ (error value, result, exception), there are still many things in common (e.g. wrap, unwrap, multiple errors, custom error types, errors in a different thread).

Furthermore, when it comes to a cloud native (or k8s native) microservice architecture, you need to pass errors across the wire.
Clearly HTTP status codes can't meet the complexity of your highly available, multi (hybrid) cloud, global scale, low latency, high throughput, custom json to yaml converter. A well-defined error format is needed at the interface: one that can be generated from different programming languages, serialized and passed through different RPC frameworks (json, gRPC), and that shows the trace across different services, libraries and functions.

The post plans to cover the following

  • all the languages
    • error wrapping, context, recovery
    • common errors (in runtime/stdlib)
    • common error libraries
  • go
    • the most basic Error() string
    • the newer go error interface proposal and implementation, errors.Is, fmt.Errorf("%w") etc.
    • go error libraries in popular projects (k8s, tidb etc.)
  • rust
    • the ? operator
  • java
    • different java exceptions (I remember there are two kinds, and you need to write throws for one of them)
  • cpp
    • the overhead of exception in cpp
  • os
    • cpu exception
    • syscall error
    • kernel error (from user and kernel itself)
  • rpc
    • http status code
    • grpc
      • define detail error instead of just using builtin errors
    • serialize error
    • trace error
    • error collecting services, e.g. Sentry
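For the go part of the outline, a minimal sketch of wrapping and unwrapping with the standard errors package might look like this. ErrNotFound, QueryError, lookup and handle are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a sentinel error that callers can test for with errors.Is.
var ErrNotFound = errors.New("not found")

// QueryError is a custom error type that carries extra context.
type QueryError struct {
	Query string
	Err   error
}

func (e *QueryError) Error() string { return fmt.Sprintf("query %q: %v", e.Query, e.Err) }

// Unwrap exposes the inner error so errors.Is and errors.As can see it.
func (e *QueryError) Unwrap() error { return e.Err }

// lookup always fails in this demo.
func lookup(key string) error {
	return &QueryError{Query: key, Err: ErrNotFound}
}

// handle adds one more layer of context using the %w verb.
func handle(key string) error {
	if err := lookup(key); err != nil {
		return fmt.Errorf("handle request: %w", err)
	}
	return nil
}

// isNotFound reports whether ErrNotFound is anywhere in the chain.
func isNotFound(err error) bool { return errors.Is(err, ErrNotFound) }

// queryOf extracts the query from the first *QueryError in the chain.
func queryOf(err error) (string, bool) {
	var qe *QueryError
	if errors.As(err, &qe) {
		return qe.Query, true
	}
	return "", false
}

func main() {
	err := handle("user:42")
	fmt.Println(err)             // handle request: query "user:42": not found
	fmt.Println(isNotFound(err)) // true
	q, ok := queryOf(err)
	fmt.Println(q, ok) // user:42 true
}
```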

Update

  • 2020-08-07 20:01 Init

[post] Write a web terminal in go using websocket

Type

  • request post from @at15

Related

Description

My Ubuntu is too old and the guake in that version does not support a split screen terminal, so I decided to use a terminal in the browser. I was also working on a udash example ...

The key to a web terminal is ... add a \n if the input does not have one, otherwise you will only read back what you wrote (because a pseudo terminal is actually a file, /dev/ptmx), and without a \n bash doesn't take any action .... (took me a while to figure it out)
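The newline fix itself is tiny. A sketch of it, assuming the websocket message arrives as a byte slice before it is written to the pseudo terminal (ensureNewline is a hypothetical helper, and the pty and websocket plumbing is left out):

```go
package main

import (
	"bytes"
	"fmt"
)

// ensureNewline appends '\n' when the input does not already end with one,
// so that the shell behind the pseudo terminal sees a complete line.
func ensureNewline(input []byte) []byte {
	if bytes.HasSuffix(input, []byte("\n")) {
		return input
	}
	// Copy so the caller's slice is never modified through append.
	out := make([]byte, 0, len(input)+1)
	out = append(out, input...)
	return append(out, '\n')
}

func main() {
	fmt.Printf("%q\n", ensureNewline([]byte("ls -l"))) // "ls -l\n"
	fmt.Printf("%q\n", ensureNewline([]byte("pwd\n"))) // "pwd\n"
}
```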

Update

[post] How Elasticsearch aggregation and Lucene facet work internally

Type

  • request post from @at15

Related

Description

ES has (multiple) aggregation. Lucene (and Solr) has facet.

Need to explain

Update

[post] How to build a distributed load generator

Type

  • request post from @at15

Related

Projects by @at15 from gradschool

Description

We love doing benchmarks. Having good performance in a benchmark does not mean the system is good,
but having poor results means the system cannot handle the workload.

How to generate different types of workload

Popular frameworks

Let's build one for http API (just for simplicity)

  • In Go
  • In Java, using thread pool vs reactive (i.e. async)
  • In Rust, (if I still remember how to
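For the Go variant, a single-process sketch of an HTTP load generator could look like the following. It is not distributed; it only shows the worker pool that each node of a distributed generator would run. Result, runLoad and runLoadDemo are hypothetical names, and the target is an in-process httptest server so the example runs on its own.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// Result summarizes one load run.
type Result struct {
	OK, Failed int64
	Elapsed    time.Duration
}

// runLoad sends total GET requests to url using a pool of worker goroutines.
func runLoad(url string, workers, total int) Result {
	// Pre-fill a channel with one token per request; workers drain it.
	jobs := make(chan struct{}, total)
	for i := 0; i < total; i++ {
		jobs <- struct{}{}
	}
	close(jobs)

	var ok, failed int64
	var wg sync.WaitGroup
	client := &http.Client{Timeout: 5 * time.Second}
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				resp, err := client.Get(url)
				if err != nil {
					atomic.AddInt64(&failed, 1)
					continue
				}
				// Drain and close the body so the connection can be reused.
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				if resp.StatusCode == http.StatusOK {
					atomic.AddInt64(&ok, 1)
				} else {
					atomic.AddInt64(&failed, 1)
				}
			}
		}()
	}
	wg.Wait()
	return Result{OK: ok, Failed: failed, Elapsed: time.Since(start)}
}

// runLoadDemo points the generator at an in-process test server.
func runLoadDemo() Result {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()
	return runLoad(srv.URL, 4, 20)
}

func main() {
	r := runLoadDemo()
	fmt.Printf("ok=%d failed=%d elapsed=%s\n", r.OK, r.Failed, r.Elapsed)
}
```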

Update

[util] Continuous Deployment Support

Type

  • util/workflow

Related

Description

Now we have to run pre-deploy every time we update the source code. This work should be done by our lovely Travis.

[post] Render and annotate PDF file using pdf.js

Type

  • request post from @at15

Related

Description

Recently I have been building a reference management system for papers (I plan to read.jpg).
pdf.js has little documentation, and most commercial products are too expensive.

This post describes:

  • how to add a text layer so you can select and highlight text
  • how to write your own viewer in vanilla js (or maybe some jQuery)
  • how to add an annotation layer (that saves annotations outside the pdf file)
  • a code walkthrough of some existing implementations: the web viewer included in pdf.js, and react-pdf-highlighter

It should serve as a good starting point for ppl who are frustrated with mendeley and want to write their own paper management system.

Update

  • change issue type from request to discuss

[Suggestion] More Posts?

Type

  • Discuss adding more posts to our blog

Description

Since the semester has ended, we might publish some interesting posts in various fields like Web, Distributed Systems, Fancy Toys, etc. Posting on our blog should therefore be encouraged, for both skilled students and freshmen.

A series of tutorials, in my personal view, could be an alternative start.

Update

TBD

@JasonQSY @gaocegege @vinx13

[post] Survey of existing Go logging libraries

Type

  • request post from @at15

Related

  • #7 Ayi, where gommon/log first comes to play
  • dyweb/gommon#26, where the real survey goes

Description

There are many logging libraries in Go. However, most of them are not that handy if you have used loggers in other languages (Java, C++), and their performance is not great either (e.g. they use the runtime to walk the stack and obtain the line number). In this post we will walk through the code of popular Go logging libraries such as the builtin log package, logrus, zap (fast), apex/log etc., and talk about what makes them fast or slow, along with some gotchas in the Go language itself. Finally, we will propose our new design for gommon/log: while the previous version was mostly modeled after logrus, the new version is more Java-ish.
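To show what obtaining the line number through the runtime means, here is a minimal sketch built on runtime.Caller. callerInfo and logf are hypothetical helpers and are not taken from any of the libraries mentioned above; the point is only that every log call pays for a stack lookup.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// callerInfo returns "file:line" for the frame skip levels above it.
// skip=1 means the direct caller of callerInfo.
func callerInfo(skip int) string {
	_, file, line, ok := runtime.Caller(skip)
	if !ok {
		return "unknown:0"
	}
	return fmt.Sprintf("%s:%d", filepath.Base(file), line)
}

// logf formats a message prefixed with the call site of logf itself.
func logf(format string, args ...interface{}) string {
	// skip=2: callerInfo <- logf <- the code that called logf.
	return callerInfo(2) + " " + fmt.Sprintf(format, args...)
}

func main() {
	fmt.Println(logf("hello %d", 1))
}
```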

Update

[post] How docker bind mount works

Type

  • request post from @at15

Related

Description

I often use bind mount as a way to 'share' between host and container. However, I don't know how it works.
I don't even know how mount w/o docker works. This post aims to solve the puzzles around bind mount in docker.

I plan to cover the following and the target audiences are ppl w/ little linux background (like me)

  • vfs
  • mount namespace
  • write a simple file system (both fuse & in kernel?)

Update

[post] How to use go's builtin template

Type

  • request post from @at15

Related

Description

Go's builtin text/template and html/template are pretty useful for generating code and custom output (e.g. docker, kubectl, helm).
However, every time I need to use them I have to learn them again ... this post should list the common syntax (and pitfalls)

  • reference variables
  • writing for loop
  • custom function
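A minimal sketch that touches all three items with text/template: referencing a field, ranging over a slice with index and value variables, and calling a custom function. The template text, the upper function name and the render helper are made up for the example. One pitfall it shows is that Funcs must be called before Parse, otherwise parsing fails on the unknown function.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/template"
)

// tmplText references a field, loops with range, and calls a custom function.
const tmplText = `Hello {{ .Name }}!
{{ range $i, $item := .Items }}{{ $i }}: {{ upper $item }}
{{ end }}`

// render executes the template with the given data.
func render(name string, items []string) (string, error) {
	// Custom functions have to be registered before Parse.
	funcs := template.FuncMap{"upper": strings.ToUpper}
	tmpl, err := template.New("demo").Funcs(funcs).Parse(tmplText)
	if err != nil {
		return "", err
	}
	data := struct {
		Name  string
		Items []string
	}{Name: name, Items: items}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render("dyweb", []string{"go", "rust"})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
	// Hello dyweb!
	// 0: GO
	// 1: RUST
}
```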

And common places to use go template

  • code generator
  • docker
  • helm

Update

  • 2020-08-07 Add usage section

[post] Build go project with private dependency using docker multi stage build

Type

  • request post from @at15

Related

Description

I spent one day at work writing a Dockerfile and a Makefile. The code at work relies on many internal projects. On Jenkins, the worker has access to all internal GitHub projects. However, if I use a fully dockerized build locally and fetch vendored dependencies using glide install or dep ensure, it fails because the container has no access. The solution is to add an ssh private key to the container at build time. With a multi stage build, the build image is discarded right away and the runner image does not contain the credentials.

There was also some interesting stuff, like needing to base64 encode and decode the private key content, chmod 0400 ~/.ssh/id_rsa, and add the known host.

Update

[post] How to use pprof in Go

Type

  • request post from @at15

Related

Description

A benchmark w/o profiling is not very useful: you need to know where the bottleneck is.
There are many pprof posts online. However, every time I need to use pprof, I find that most top-ranked search results are outdated or not good enough.

This post plans to cover the following

  • get profile data
    • in test using -cpuprofile
    • using http handler w/ custom route
    • dump to specific place in application code
  • read profile data
    • what is cumulative
    • why the sample time is larger than actual execution time (hint: you got more than one core)
    • flamegraph (honestly I found the default svg more useful than flamegraph for simple use cases)
  • internal of profile format
    • the proto
    • analyze pprof data in your own code
  • pprof in other languages
    • pingcap has one in rust
    • not sure about cpp and java, google might have it?
  • compare with perf
  • efficient storage of pprof data
    • this is mainly for benchhub ... I want to find a low cost way for compressing and querying profile data
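For the "dump to specific place in application code" item, a minimal sketch with runtime/pprof could look like this. busyWork and profileCPU are hypothetical helpers; the profile is written to an in-memory buffer to keep the example self-contained, while a real program would usually write to a file and open it with go tool pprof.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// sink keeps the result alive so the work is not optimized away.
var sink int

// busyWork burns CPU so the profiler has something to sample.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// profileCPU runs work under the CPU profiler and returns the raw profile,
// which is a gzip-compressed protobuf message.
func profileCPU(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	data, err := profileCPU(func() { sink = busyWork(50000000) })
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of profile data\n", len(data))
}
```

In tests the same data comes for free from go test -cpuprofile, so this style is mostly useful for long-running applications where only one code path should be profiled.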

Update
