sangwoo-joh.github.io

Contributors: sangwoo-joh, kkeundotnet

sangwoo-joh.github.io's Issues

Django Models (ORM)

  • A QuerySet only hits the database when it is evaluated.
  • Giving filter several conditions at once, i.e. a query like filter(some_field__in=[a, b, ...]), gets interpreted as a GROUP BY and is slower in most cases. It is better to combine strict equalities instead: using Q objects, filter(Q(some_field=a) | Q(some_field=b) | ...) is much faster.
  • Put indexes on frequently searched fields. Filtering an unindexed field with lookups like lte can be slower than evaluating the whole QuerySet and post-processing it in pure Python, even when that logic is O(n^2).
  • Cache frequently used pages. Redis is excellent for this, and cacheops makes it almost effortless.
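The IN-versus-OR point above can be sketched with plain sqlite3; the item table and tag column are made up, and the two SQL strings only approximate what filter(tag__in=[...]) and filter(Q(tag=...) | Q(tag=...)) compile to:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, tag TEXT)")
conn.executemany("INSERT INTO item (tag) VALUES (?)", [("a",), ("b",), ("c",)])

# Roughly what filter(tag__in=["a", "b"]) runs:
rows_in = conn.execute(
    "SELECT id FROM item WHERE tag IN (?, ?)", ("a", "b")
).fetchall()

# Roughly what filter(Q(tag="a") | Q(tag="b")) runs:
rows_or = conn.execute(
    "SELECT id FROM item WHERE tag = ? OR tag = ?", ("a", "b")
).fetchall()

# Both forms return the same rows; the note above is only about speed.
assert sorted(rows_in) == sorted(rows_or) == [(1,), (2,)]
```

Whether the OR form is actually faster depends on the database and the indexes involved, so measure before committing to either shape.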

Multicore OCaml outline (notes to self)

Intro: The road to OCaml 5.0

  • Concurrency vs. parallelism: the difference between the two, CPU, memory
  • Single-core garbage collection: memory representation plus a bit of the type system; why a GIL? (compare with Python and Java; Python's GIL is reportedly going away soon too)
  • Multicore memory model and the parallel GC: need to read the paper. Local DRF
  • Implementation of Multicore OCaml: domains, effects. Why is native support needed? The limits of monadic concurrency such as Lwt/Async.
  • Application: introduce infer_parallel (fork + message-passing based) -> migrate it to a domains + effects basis

Idioms

  • hold a candle to ...: to be comparable to ..., to measure up to ... (almost always used negatively: "can't hold a candle to")

Crontab

Registration

crontab -e

minute / hour / day of month / month / day of week / command
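For example, a hypothetical entry (the script path is made up) that runs a backup at 03:30 every Monday:

```
# minute hour day-of-month month day-of-week command
30 3 * * 1 /home/me/bin/backup.sh
```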

Docker data root

  • Default: /var/lib/docker
    • If /var is not mounted on a separate large disk at install time, it fills up fast, especially once you start building code.
  • How to move it
    • Stop the service first: service docker stop
    • Move the data to the large mount: e.g. mv /var/lib/docker /mnt/docker
    • Update the daemon config, /etc/docker/daemon.json:
      • "data-root": "/mnt/docker"
    • Restart the service: service docker start
    • Profit!
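A minimal /etc/docker/daemon.json after the move would look like this (assuming the /mnt/docker path from the example above; merge the key into any existing config rather than replacing the file):

```json
{
  "data-root": "/mnt/docker"
}
```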

Toy Project Idea

Assorted toy project ideas. Add more via comments.

Top in Life

The concept of Top from the static analysis domain, applied to life..
Life -> whatever you do, it all ends when you die, doesn't it?
Nihilism..
How should one live?

Creating an OCaml project template in xxxx

  • ocamlc, ocamlopt: forget them
  • ocamlfind: not bad, but forget it
  • Make: fine if you are going to integrate it with dune for compatibility
  • OMake: forget it.
  • Oasis: no idea what it is.
  • dune: it holds the de facto standard position, so use it. The documentation is so hard to digest, though, that there is effectively no tutorial...
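For the record, a minimal dune setup looks something like this (the project and executable names are invented; double-check against the dune documentation for your version):

```
; dune-project
(lang dune 3.0)

; bin/dune
(executable
 (name main)
 (public_name my_tool))
```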

Docker-in-docker

When you want to start a container from inside another container and do some work there:

  • Installing the docker binary directly into the image does not work
    • Running the docker daemon requires init as PID 1
    • In a running container, the command it was started with (e.g. /bin/bash) usually becomes PID 1, so this is impossible

Conclusion: share the host's Docker setup with the container

  • You don't have to repeat a complicated Docker setup twice
  • Images already pulled on the host usually need to be reused anyway, so reuse is easy

Method: mount the host's Docker-related files into the container. The exact paths can vary between setups, but usually...

  • /var/run/docker.sock
  • /usr/bin/docker
  • /var/lib/docker

Tip: to exchange data smoothly between the host, container 1, and a container 2 triggered by container 1, it is best to also mount /tmp everywhere and share the data as files written there.
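Putting the mounts together, a typical invocation might look like this (paths as listed above; the socket mount is the essential one, and the image name is just an example):

```shell
docker run -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /usr/bin/docker:/usr/bin/docker \
  -v /tmp:/tmp \
  ubuntu:22.04 bash
```

The docker client inside the container then talks to the host's daemon through the mounted socket, so "sibling" containers actually run on the host.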

Story

  • OS basics
    • CPU: process, preemptive thread
    • Memory: virtual memory, resident memory, paging
    • Communication: IPC, shared memory, signal
  • Compile vs. interpreting, in the view of type
  • Garbage collection
    • incremental, generational
    • stop-the-world, algebraic effect
  • next..?

Chomsky Hierarchy

Type 0

  • Unrestricted grammar
  • Recursively enumerable languages
  • Turing machine
  • Contracting (rules may shrink the sentential form)

Type 1

  • Context-sensitive grammar
  • Context-sensitive languages
  • Linear-bounded non-deterministic Turing machine
  • Non-contracting (rules never shrink the sentential form)

Type 2

  • Context-free grammar
  • Context-free languages
  • Non-deterministic pushdown automata
  • A -> α (a single nonterminal on the left-hand side)

Type 3

  • Regular grammar
  • Regular languages
  • Finite state automata
  • A -> a, A -> aB (right-linear)
  • A -> a, A -> Ba (left-linear)
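As a tiny illustration of Type 3, the right-linear grammar S -> aS, S -> b generates the regular language a*b and maps directly onto a finite automaton; a quick sketch:

```python
def accepts(s: str) -> bool:
    """Finite automaton for the right-linear grammar S -> aS | b (language a*b)."""
    state = "S"
    for ch in s:
        if state == "S" and ch == "a":
            state = "S"        # S -> aS: consume an 'a', stay in S
        elif state == "S" and ch == "b":
            state = "ACCEPT"   # S -> b: consume the final 'b'
        else:
            return False       # no transition: reject
    return state == "ACCEPT"

assert accepts("aaab") and accepts("b")
assert not accepts("ba") and not accepts("abb")
```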

org로 jekyll 블로깅하기

  • jekyll-org: compiles org files with org-ruby and emits HTML.

There were two issues.

Heading id issue

org-ruby did not generate heading ids. There was a PR for this, wallyqs/org-ruby#95, but it is three years old and has not even been reviewed, so I decided to just take it and use it myself.
I forked both jekyll-org and org-ruby, pointed the Gemfile at my forks, and patched them directly.

pygments.rb 이슈

jekyll-org does not directly support syntax highlighting of code with a language specified between #+begin_src and #+end_src; it uses pygments.rb. So I added it to the Gemfile, and it worked fine locally. In the GitHub Action, however, it failed. Looking into it, the cause was that /usr/bin/env python3 could not be found, even after adding actions/setup-python@v4. There was a related PR, jeffreytse/jekyll-deploy-action#55; since jekyll-deploy-action uses its own Arch Linux image, Python had to be installed separately via pre_build_commands. Worked fine after the fix.

PCRE

  • [ ]: custom character class
  • (?!<pattern>): negative lookahead
  • (?<!<pattern>): negative lookbehind
  • \w: word character (may include Unicode letters, depending on flags)
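Python's re module supports the same lookaround constructs (its syntax matches PCRE for these, though the two engines differ elsewhere):

```python
import re

# Negative lookahead: "foo" not followed by "bar"
assert re.search(r"foo(?!bar)", "foobaz")
assert re.search(r"foo(?!bar)", "foobar") is None

# Negative lookbehind: "bar" not preceded by "foo"
assert re.search(r"(?<!foo)bar", "xbar")
assert re.search(r"(?<!foo)bar", "foobar") is None

# \w matches word characters, including Unicode letters by default in Python 3
assert re.fullmatch(r"\w+", "café한글")
```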

Removing meaningless <ul>s

Meaningless <ul> elements (rather than <li>) end up nested inside <ul>; track down where they get added and remove them.

Language Design

From the Signals and Threads podcast, "Language Design" episode (transcript excerpts)

Modularity

(Leo)
Anyway, so, strong type system, and the last thing, which I think is more unique to OCaml or at least more unique to the MLs, is that OCaml takes modularity very seriously.

So, what do I mean by that? I mean the best way to describe what I mean by modularity is probably to talk about abstractions. So, the definition of abstraction is essentially that you have some system, and what you want to be able to do is take one component of the system and replace it with another, and the mechanism by which you do that is that you have some notion of interface that describes the behavior loosely of a component, and when you replace one component with another, you replace it with one that provides the same interface as the other one.

And so, this ability to kind of, swap things out is very powerful, and very pervasive in the design of OCaml.
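The swap-components-behind-an-interface idea can be sketched in any language with structural interfaces; a rough Python analogue (all names are invented for illustration):

```python
from typing import Protocol


class Store(Protocol):
    # The interface: any component providing these operations can be swapped in.
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> str: ...


class DictStore:
    def __init__(self) -> None:
        self._d: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._d[key] = value

    def get(self, key: str) -> str:
        return self._d[key]


class ListStore:
    def __init__(self) -> None:
        self._pairs: list[tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        self._pairs.append((key, value))

    def get(self, key: str) -> str:
        # Last write wins, same observable behavior as DictStore.
        return next(v for k, v in reversed(self._pairs) if k == key)


def greet(store: Store) -> str:
    # Client code depends only on the interface, not the implementation.
    store.put("name", "world")
    return "hello " + store.get("name")


assert greet(DictStore()) == "hello world"
assert greet(ListStore()) == "hello world"
```

OCaml's module system makes this swap a first-class, statically checked operation via signatures and functors, which is the point Leo is making.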

.....

That’s my opinion. So, the most recent thing that looks vaguely ML-ish, is probably Rust, you know, so Rust’s original compiler was written in OCaml, so, I’m pretty sure they had some experience with MLs when they wrote it and you can see some of that coming through in some of the language design, but it lacks a module system, and so, maybe the next ML didn’t have a module system, but I think that’s a mistake. I think that’s a shortcoming in Rust that makes me sad, and makes me less likely to use the language.

It’s one of those things. It’s a bit of a chicken / egg thing. Like I don’t know whether I use OCaml because I like modularity, or I like modularity because I use OCaml. You know, it’s kind of hard to unwind that.

(Ron)
Yeah, like this is the general Stockholm syndrome problem of language design: you use a programming language, you get very used to it, and it is hard to keep track of whether you like its features because you use them or whether you use it because you like them.

.....

(Leo)

Yes. Operator overloading, is a mechanism that gives you ad-hoc polymorphism. So, I would say that type classes were a major breakthrough in ad-hoc polymorphism, and they’re still kind of the gold standard to measure your ad-hoc polymorphism features against, but it’s a very kind of, anti-modular feature. So, let’s say, you want an addition function and so you define an addition type class, and then you decide that you want it to work on integers, so you have to, somewhere in your program, write “this is how addition works on integers,” and the important thing there is that it really has to be, just in the entire program, for type class to work, there has to be just one place where you’ve written, this is how addition works on integers.

And so, if you have one library that's written an addition instance over here, and another one that's written another addition instance, you will not be able to link them together. And it's kind of anti-modular, because you don't notice this problem until you get to link time. Like you try to link them together and it says, "Oh, no, those two bits have to live in separate worlds. They can't combine like this. You have kind of broken some of the compositionality of your language."

(Ron)

I remember being actually quite surprised about this when I first heard about it, because there's essentially no way of hiding this aspect of your implementation. You cannot, in any way, wrap it up such that the particular choices you made about type classes in the middle of your program aren't exposed. It's sort of anti-modular in a fairly in-your-face way.

Mathematical Simplicity

(Leo)

The phrase mathematical simplicity is probably an over-simplification. A lot of people would maybe use the word say elegance or something like that, but I tend to think of the problem in terms of languages having sharp corners, and the interplay of different language features. If your language features are, in some mathematical sense, fairly simple and in particular, composable, then there will tend to be many fewer corners that you can cut yourself on in the language.

I think a language that has not got this is something like C++, which I mean, I’ve never really used it professionally. I’ve used it here and there. But it is a language made up of sharp corners. If you want to understand exactly what happens when you use interesting interplays of templates and operator overloading, and this, that and the other, you can very easily end up with code that kind of does what you want mostly, but then you try and use it on a different type and suddenly it breaks for all kinds of confusing reasons, and what’s happening there is that the features of the language are not really compositional. They’re somewhat ad-hoc and complex, and when they interact with each other, it’s hard to reason about what they’re doing.

.....

The thing I always hear from people who’ve worked with C++, is always like, oh, it’s a really good language as long as you stick to a subset, but none of them agree on what the subset is.

Anyway, so I think that that’s a good example, and so, I mean, maybe another way to think about mathematical simplicity is really, that, when you’re looking at languages and the behavior of programs mathematically, really what you’re doing is trying to formalize how you reason about programs. As programmers, we reason about programs all the time in many diverse and interesting ways, and people attempt to encode that in mathematics, and if you can encode how to think about your language feature, how to reason about its behavior mathematically, because it is simple – because if it’s complicated, it would be very hard to do the maths there – it’s probably just easier to think about. It’s probably easier for people to reason about it, too.

(Ron)

I’m less certain that this is true, but another thing that I’ve come to think works out better in languages that are kind of mathematically grounded, is that features end up doing more than you might think they would. Like, an example that really struck me many years ago was when there’s this thing that was added to OCaml called GADTs, generalized algebraic data types. Without going into the details of what this language feature is, or how generalized algebraic data types generalize on ordinary algebraic data types, it’s a language feature for which the obvious use cases that I had heard about and seen, seemed to all be about the kind of things that you do in constructing compilers of like representing abstract syntax trees or other representations of program fragments. And it seemed like a great tool for compiler writers, but maybe not a great tool for me, since I’m not really a compiler writer. And it turns out, GADTs have been incredibly useful in a wide variety of ways, in ways that surprised us, and I think, surprised people who added GADTs to the language.

It turns out to have been, for us, very useful for doing performance optimization: various ways of trying to get more control over the memory representation of objects in the language have been aided by GADTs in a way that, I think, a lot of people found pretty surprising, and I tie some of this uncanny generality of the feature to the fact that the features hold together in a way that goes beyond the initial use cases imagined by the people who added them.

Evolving a Language Forward

(Ron)

You spend a lot of time working on programming language design, but you never design a new language, or at least very rarely, think about designing an entirely new language. The work you do is mostly about evolving a language forward. How do you think about the difference between designing new language features from scratch and figuring out how to extend an existing language to make it better?

(Leo)

It is harder to evolve an existing language than to work on a new one. I tend to think of writing a language from scratch as doing the whole thing on easy mode; where's the fun there? I genuinely enjoy the challenge of trying to fit features into the existing language, which comes in many forms: from the frustration that keyboards only have so many symbols on them and there just isn't room for one more symbol in the language (I've yet to try and use Unicode emoji for anything, but we'll get there eventually), to thinking hard about making sure that your feature is composable and sensible and simple, and composes with the stuff that's already there, as we discussed earlier about the interactions of features. You also tend to hit technical debt in the language: the places where people took a shortcut from the language design perspective and said, "oh, let me just make it do this, it'll be fine," often turn out to be a bit of a nightmare for you later when you write something else. I enjoy the challenge of that more. I also think that it's really the only feasible way to actually write stuff that's going to get used by real users to solve real problems.

If I just make a toy language, the odds of anyone using it other than me are almost 0. The mechanics of how languages become popular are confusing and unknown and you are very unlikely to produce the next big language, so, to a certain extent, you don’t really have a choice, if you want to actually solve real problems and actually help real people with their programming problems.

(Ron)

Part of the reason I think you have to spend almost all of your effort on language evolution, is decent languages take forever. Jane Street started using OCaml 18 years ago, and I remember at the time thinking, it was already like, a pretty good language. Like, it had a good implementation. In fact, surprisingly good given that it was an academic language, since academics don’t usually produce reasonable software, and we’ve been working on it for a long time, and there’s still a lot of deficiencies and lots of things to fix. Writing a really great programming language is literally a decades long affair, and so, if you want to do great things, it’s much, much easier if you can glom on to some existing language and figure out how to evolve it, then just come up with some grand new idea.

(Leo)

When new languages do break through, they tend to be built around some new idea. Rust breaking through, for instance, is really the idea of linearity-style typing, or ownership typing, breaking through. I doubt that Rust has the best ownership story that will ever be; it's the first one, the big breakthrough. Basically, I just like taking other people's language stuff and moving it onto OCaml, after they've already done the hard work of working out where all the problems are, which you normally do through trial and error: by implementing the wrong thing and then finding out that it was the wrong thing later.

(Ron)

In fact, OCaml has kind of a habit of this, which is language features tend to arrive, I guess, depending on what perspective you have, kind of late or early. You can look at various features in languages like OCaml and think of OCaml as really cutting edge. Lots of languages now are starting in the last 5-10 years to get type inference. OCaml had type inference from the mid-90s, of course, that’s a feature that had already existed in ML, and the ML languages that preceded it for 20 years, right? So even that feature that now seems kind of cutting edge was already kind of old hat when OCaml picked it up.

And OCaml over time, has lots of examples of picking up features after they have been bounced through lots of other language communities, but it takes such an incredibly long period of time for language features to get popular, that you can sort of still be somewhere in the middle and seem both cutting edge and incredibly stodgy at the same time. I guess my favorite example here is like, garbage collection, because it is the single most obviously useful advance in programming languages. Like, people can argue about how valuable types are and I’m a really big proponent of types and all of that, but like, garbage collection. Garbage collection is clearly a useful language feature and it was invented in the mid-50s and got popular in like, ‘95 with Java. You could sort of argue that’s the point where it really hit the mainstream.

(Leo)

Yes. Algebraic data types is another one; that’s like, Hope in 1980-something. From a programming design perspective, the lack of sum types, the lack of types that represent something being either this or this, is just kind of insane. Like, it’s just so obviously dual to like records and structs, which have been in everything since forever, I’m always amazed when a new language comes out without them.

What’s the most recent big language that came out without that… probably Go, like what? How is there not a variant type in there? It’s insane.

(Ron)

Say that again, because I think it’s really easy to miss. There’s this thing that every language on earth does, which is give you a good way of expressing conjunction: I have this and that and the other, and whether you call it a struct or a record or a class, there’s always a way of doing that “and,” that conjunction. And there is, it turns out, the obvious complement to that, which is disjunction: I would like to have a type that represents this, or that, or the other. There are always ways of encoding it: there are union types in C, and you can have things that inherit from the same class in object-oriented languages, and all of that, but it’s always clunky and second-class, and doesn’t give you the kind of type checking and static guarantees that you really want. It is, I agree, the most screamingly obviously useful feature, and I guess, among mainstream languages, Swift now has it.

.....

So another constraint around evolving languages, which you didn’t talk about, is backwards compatibility and the pressure from all the existing source code that’s out there. Even if you want to make relatively subtle changes to the semantics of a language, it can be very hard to get those changes rolled out, because there are millions or billions of lines of code that depend on the current behavior, and figuring out where that is and how to evolve it forward can be extremely challenging. The most extreme terrible case of this I know of is the Python 2 to Python 3 transition, where people tried to make a whole bunch of changes to the language, and it is only starting to get finished up now, 15 years or so after it started. I don’t know the exact timing, but it’s been an incredibly long process.

I’m curious how you think about the role that popularity plays in the ability to evolve your language.

(Leo)

Oh, that’s interesting. Yeah, so, I mean, popularity, I think, does make it harder to evolve your language. I mean, like kind of trivially it does, when the least popular language is used only by me, and then it’s easy. I can just change all my code and then it’s fine. There’s no compatibility issue.

Yeah, so the more popular your language is, the more difficult it is going to be to change it, at least in a backwards-incompatible way. It’s worth saying that, especially if you’re careful, you can make a lot of changes in a way that just doesn’t break anything, and I think, where possible, you should just do that. But sometimes it’s unavoidable. There’s just something that you implemented earlier that’s just wrong, or there’s just some tiny corner that’s preventing a new feature from working how it should, and so you really need to fix the old feature before you can add the new one. That definitely does get more difficult if your language is more popular.

A lot of people are kind of fairly obsessed with the popularity of their language. I’m not indifferent, but it’s not a thing that I spend my time thinking about very much. There’s like a certain amount of popularity that is required for you to have an active community that is producing useful libraries and doing good things, and you want to make sure you have that. Beyond that, I’m less clear on the cost versus the benefit. It’s a mixed bag.

.....

I think one thing that’s worth saying is that, to a certain extent, I’m still fairly selfish in my approach to language design. I still kind of think that the point of working on OCaml is to make it easier for me to write programs. I try to generalize that to encompass everyone, or at the very least the other people at Jane Street, but in terms of where ideas come from, I think they kind of come from there.

I said earlier that OCaml was very general purpose, but it’s certainly not great in every niche, in every use case. I’d really like to make it competitive with Rust and C++ and C, for writing low-level hand-tuned code, and that seems plausible to me. I can see a path by which you add certain features and make it so that you can do that in a way that is convenient.

I mean you can get pretty far right now. You can avoid allocating, and so on, and keep your latencies – so you avoid garbage collection latencies, and we have some upcoming features, things like unboxed types that we’re currently working on here, similar to the unboxed types in Haskell, that will give you better control of your layout.


(Leo)

We definitely don’t have the last word. I mean, you can go to GitHub and observe us attempt, and fail, to get various things merged upstream, to see that we don’t have the authority to just go and put in whatever we like.

Yes, so, I think, firstly, it’s worth pointing out how much we benefit from this. If we were to fork and just manage our own language and do our own thing off to the side, we would lose the review from the upstream developers, who’ve been working on OCaml for 30 years and really are the kind of people you want reviewing your code, and all the libraries and tools in OCaml that come from the excellent open-source community. So there’s a lot of benefit to working with an existing language and an existing community that you don’t just own. And if you look at places where companies have spawned a language, they often go to great lengths to try and give the ownership away, essentially, to try and get some of these things. F#, I think, is a good example of that, where Microsoft is trying to let the community take a bigger and bigger role, because the input that you get from doing that is great. At the same time, it obviously slows us down and means that we have to stop and think about things.

So, I think effective communication with upstream is kind of the key to having this process be successful and useful for everyone involved. It’s not a thing we’ve always got right. Sometimes we’ll develop something, completely internally, for far too long, and not really share it upstream, and then just kind of drop this finished thing, and I think that that is a lack of communication that you should try to avoid. You can see how it happens. It’s very tempting to just kind of not show people a thing until you’ve got it just right. You want to add things to the language and for it to make progress, and the process of upstreaming is obviously the mechanism by which that is slowed down, but I think that that’s kind of by necessity. I think language design is quite a slow process, and should be quite a slow process. The backwards compatibility things that we were talking about earlier, I mean, you put the wrong thing in, you will be stuck with it, kind of indefinitely, or you will have to do some huge amount of work later to undo it. So, you really want to be sure that you’re doing the right thing in language design.

So, I think that necessarily makes everything kind of move a bit slowly, and that can be a frustrating process, but the pros outweigh the cons.

....

(Ron)

So you were talking a little bit before about forking and why it’s not really a great idea to just straight up fork the language. And indeed, that’s something I get asked about a lot. People are like, Jane Street has a few thousand people and lots of resources, why doesn’t Jane Street just fork? And I think you did a good job of explaining the downside of that. But I wonder how you think about the idea of forking a little, which is to say, some of the constraints you were describing around language evolution have to do with the problems of backwards compatibility. Another constraint that you talked about, essentially, is the need for feedback. You were saying in some sense what OCaml had been doing, the MO has been, wait for someone else to try something out in some other language, work out a theory, write a few papers, try a few awkward implementations in other languages, and then when you see the dust settle, well, okay, pick something up and try to get the best version of that in OCaml. But it’s super tempting to want to do some of that experimentation and it feels like a place like Jane Street, feels like, in some sense, a safer place to do it, because one of the things that we can do, is we can fix things. We have these kind of work flows around these large monorepos where we have an enormous amount of control, and we can add a feature in, expose it only internally, and only turn it on in a few places, and then, when we discover, inevitably, that we did a terrible job of it, we can rip it out and just update all the code that used it, or modify the feature in significant ways, and worry much less about these kinds of constraints of backward compatibility and the unbounded usage of the feature, and by dint of that, get a lot of extra feedback, which can hopefully help us iterate and innovate faster.

What do you think the tradeoffs are here? Should we do more of that?

(Leo)

I think, basically, it would be good to be able to do this, and the thing that places limits on it is the management of that kind of fork. Rebasing large changes over a moving codebase is a painful and expensive process, and that’s the cost you’re going to pay to have this forked version of the compiler.

To be honest, I think we are leaning more and more in this direction. We’ve always had our own branch, because you want to backport a bug fix or something before you move to the next version, and the patches that we are carrying on our own branch are getting bigger, so I think we are leaning in this direction. But yes, I think the limits on it mostly come from the technical aspects of how you manage that fork.

(Ron)

It’s striking, how in some sense, crude, our method of extending some other person’s software is. We literally represent the changes we want to make as a set of textual patches, right? Where you just say, oh let’s like remove this line and put some other line in instead. The surprise there is not how hard that is to maintain, but that it kind of works at all.

(Leo)

Yes, and the thing you can do to try and improve it is, obviously, to improve the abstractions in the compiler itself. You know, to put some APIs in place, put some decent interfaces on things that don’t have to move all the time, which makes it easier for things to evolve independently without breaking each other.

(Ron)

There are whole language communities that work, in some sense, this way, and have easier-to-use ways of extending the compiler. I think of languages like Racket, which is a descendant of Scheme, itself a descendant of LISP, a language that goes back to the mid-50s. If you look at a language ecosystem like Racket, everything is designed around making the compiler extensible so that people can write their own extensions without having to fork. Can you imagine a future where we would have more capabilities of that kind inside a language like OCaml?

(Leo)

To me, that kind of Scheme-style extensibility, working out how you do that in a typed language, in a way where you maintain the type safety, is the thing I think of as the holy grail; it’s what a lot of work in programming language design is heading towards. I’m not sure the people doing that work are explicitly thinking in those terms, but certainly, that’s how I see it.

Yeah, so I would basically just say, yeah, that’d be great, but that is an open research problem.

(Ron)

And can you give a little more texture as to why it’s hard? Why does a type system make this kind of extensibility more difficult?

(Leo)

So, if you want to add a new type system feature to a language without breaking its safety, the code of that feature, the implementation that the user is writing, is going to need to have certain properties, and you’re going to have to show statically that the code they’ve written has those properties. So, essentially, you end up needing a type system that is very expressive, in order to express “the kind of language extension that doesn’t break everything,” which is roughly what you want.

If you’ve ever tried to implement, or prove correct (prove sound, at least), a type system in a theorem prover, which I have done, it is a painful, slow, and very time-consuming experience. I’ve been working on a type system implementation in the Coq theorem prover for 4 or 5 years. I proved it sound after about two, but it was still two years of work to get to that point.

Limitation of OCaml

(Ron)

So, what do you think of the limitations of OCaml as a competitor for languages like C++ and Rust, and to be clear, at Jane Street, we in fact, spend a lot of time writing very high-performance programs in OCaml, so as you said, it is a thing that you can do. But there are certainly ways in which it’s not optimal, so, I’m kind of curious if you can try to lay out what you think the core limitations are.

(Leo)

Sure. So, I guess, control of memory layout is just crucial to performance on a modern processor, and you don’t have as much control of your memory layout in OCaml as you do in those languages. There is some cost to using garbage collection, so sometimes you would like to avoid it, which you can do now in OCaml; but when you do, you lose the ability to create things like records and structs of data, and the variant types that we were just talking about. So, you go from a language which is generally higher level and easier to program in than C to actually losing some of the things that C has, because OCaml relies on garbage collection to provide those features.

.....

Yes. I mean, it’s worse than that. Like my favorite would be if you used like OCaml’s normal support for 32-bit integers, it’s like 10 words per 32-bit integer, because you’ve got to box the thing, and then the boxed thing is a few words, so that’s clearly room for improvement. That is not the smallest and simplest way to represent 32 bits of data, and so that specific case is something we’re looking at, quite a lot with the unboxed types, that’s our approach to fixing that aspect.

So another thing that these low-level languages give you is the ability to allocate things on the stack and refer to those, have pointers into your stack, basically, that gives you a much cheaper than garbage collection way of allocating data, without having to copy all the, you know, you can make a struct on your stack and then pass the pointer forwards down the stack, down to functions that you call, which gives you a cheap way of grouping some data together and passing around a reference to that data.

So that’s another feature that you just can’t really do in OCaml.

.....

Yes. So, if you want to keep the kind of safety that OCaml users are used to, then you have to do that. I mean, you can also just be C and let people screw it up, but that’s not great. That’s the kind of thing that leads to tons of security bugs, so it’s not ideal.

Yes. Those are probably the main things that I think of that the low-level languages provide that we don’t really have. Another one, I suppose, goes hand in hand with the ability to control memory: having facilities for conveniently reasoning about the resources that you hold. This is kind of by necessity. Those languages lack a garbage collector, so when you allocate memory, that’s a resource you have to pay attention to, because you’ve got to free it yourself, and so they provide a lot of useful mechanisms to help you with that process.

But the mechanisms that are good for helping you manage a piece of memory that you have allocated are also great for managing, say, a file you’ve opened, or a database connection that you’ve made: things that need closing when you’re done with them. So those languages have some great features for that, and I think OCaml could definitely benefit from brazenly stealing them.
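OCaml’s standard library already has a small piece of this in `Fun.protect` (since OCaml 4.08), which guarantees a cleanup action runs whether the body returns normally or raises. A minimal sketch, where the `with_out_file` helper is invented for illustration and is not a standard function:

```ocaml
(* Fun.protect runs ~finally on both normal return and exception,
   giving a scoped, RAII-like idiom for resources. *)
let with_out_file path f =
  let oc = open_out path in
  Fun.protect ~finally:(fun () -> close_out oc) (fun () -> f oc)

let () =
  with_out_file "/tmp/demo.txt" (fun oc -> output_string oc "hello")
```

Unlike C++ destructors or Rust’s `Drop`, though, nothing stops you from leaking the handle out of the scope and using it after it has been closed, which is exactly the kind of reasoning those languages’ ownership features help with.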

.....

So, the type system is expressive, but it is not the most expressive. There are more interesting type systems that let you express more properties of your program. In particular, I’m thinking of the dependently typed languages: theorem provers like Coq and, I guess, Lean, and languages like Agda and Idris. Dependently typed languages are maybe a bit difficult to explain in a short period of time. The classic example would be the ability to have a type for lists of length N, where N is not some fixed constant; N is a variable in your program. Someone passes you a number N, and for a list that’s as long as that, the type system knows that it’s exactly that long. So, if N is 0, it knows that the thing is empty, and if it’s 1, it knows that you don’t have to worry about whether it’s empty or not: you can just take the first element.

That’s the quintessential example of a dependent type. But that’s very expressive.

Yes. So, the type of those lists depends on the value of the integer, or natural number, that you… N, basically.
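In today’s OCaml you can approximate this example with a GADT, at the cost of encoding the lengths as type-level Peano numerals rather than ordinary numbers. A sketch:

```ocaml
(* Type-level Peano naturals: z is zero, 'n s is successor. These types
   have no values; they exist only to index the vector type. *)
type z
type 'n s

type ('a, 'n) vec =
  | Nil : ('a, z) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n s) vec

(* Head of a provably non-empty vector: no option, no exception. *)
let hd : type a n. (a, n s) vec -> a = function Cons (x, _) -> x

let v = Cons (1, Cons (2, Nil))
let () = assert (hd v = 1)
(* hd Nil is rejected at compile time: Nil has length z, not n s. *)
```

Note that the lengths here are made-up type-level numbers disconnected from any runtime value of N, which is precisely the limitation of this encoding relative to real dependent types.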

So, anyway, I’d like to push OCaml in some ways in that direction. So, Haskell is doing this in a big way. There’s been long-running projects to kind of make Haskell more expressive like this, and I would like to do similar things in OCaml, especially, kind of around the module system, which is already, vaguely related to dependently typed languages in some ways.

(Ron)

If you could make a pitch to a grizzled old systems programmer like myself as to why I should care about this fancy type-theoretic dependent-types nonsense: why will easier access to dependent types in the language make my life better as an OCaml programmer?

(Leo)

It’s quite easy to make that argument, since what GADTs are giving you is what this is giving you, but less broken. GADTs are, to some extent, a broken version of dependent types that is not actually quite the thing you want. And since, as you said earlier, you’ve already been convinced of the usefulness of GADTs, it’s not that hard to take you the rest of the way, and that’s probably how I’d do it. I’m not quite sure how you take someone who’s never written anything using something like GADTs and show them that this is how they want to write their programs.

...

Yeah, sure. So, people at Jane Street currently write code where they have a type that is parameterized by some notion of permission: basically, whether this is a reading thing or a writing thing. A file is an obvious example of this, a file handle rather. Is this a read-only file handle, or a write-only file handle, or is it read and write? The way that they achieve that is they make up a type called “read” and a type called “write,” and then they make their file type a parameterized type: it has a type parameter, and that type parameter is either going to be the read type or the write type. That’s how they achieve it. But this is clearly kind of nonsense. It’s just an encoding to get the type checker to check what they actually want. Read is not a type like Int is a type; there are no values of type read. It’s a fiction they’re creating to get the behavior that they’re after.
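The encoding looks roughly like this (the names and the handle type are invented for illustration; this is not an actual Jane Street API):

```ocaml
(* Phantom types: read and write have no values at all; they exist only
   to instantiate the 'perm parameter of handle. *)
type read
type write

type 'perm handle = { name : string }

let open_ro path : read handle = { name = path }
let open_wo path : write handle = { name = path }

let read_from (h : read handle) = "data from " ^ h.name

let () = print_endline (read_from (open_ro "config"))
(* read_from (open_wo "log") fails to type-check: write <> read. *)
```

The type checker enforces the permission, but `read` and `write` are empty fictions rather than values you could branch on or pass around at runtime.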

And so, dependent types would let you do that properly: rather than indexing your type by these made-up types, you would index it by an actual value of a sum type, literally just an enum, as in C, that is either read or write. You would have a read file and a write file, and read and write would belong to an actual data type that you could write functions on, and things like that.

(Ron)

This actually reflects a problem we’ve run into a lot: there are lots of places where you try to get some extra power out of the type system, so you do some clever encoding to get the type system to check an extra property that you want. But I think people are often surprised at how error-prone it is to get that encoding just right, because, as you say, it’s kind of disassociated from what you’re really doing. OCaml programmers are in some sense spoiled. They’re used to the fact that there’s a large class of errors that the type system just automatically catches for them, and they’re used to things clicking together. But when you start doing these kinds of encoding tricks to add extra rules into the type system that aren’t naturally there, the rules end up being kind of free-floating, and it’s very easy to screw it up: you think you’ve encoded some permission correctly, but you could just get it wrong, and it’s very hard to notice that you got it wrong.

So, it sounds like having something that’s more organized around dependent types would ground this kind of encoding, so it would be less likely to just be wrong in a silent way.

(Leo)

Yeah. That’s right. Dependent types are roughly the feature that lets you directly write what people are trying to do every time they reach for that encoding, and it’s always going to be easier if you’re not trying to encode something: you can just write down what you actually mean. It’s just simpler. That’s how I think of the feature. It’s really about doing a thing that people already do. As soon as you give them anything that’s a bit like GADTs, they straight away start doing this kind of encoding, and really that’s telling you that this wasn’t the feature you should have given them in the first place. Dependent types are what they’re after.

It’s a ton more work than implementing GADTs, so yes, there’s a reason one comes first. But I would love to add that to OCaml in some capacity or other.

WSL 2 + emacs 28

  • WSL 2: you can upgrade a distribution with the wsl --set-version command. After installing Ubuntu 20.04 and switching it to version 2, performance improved enormously.
  • Building emacs 28 on Ubuntu 20.04 produced broken font rendering.
  • Installing via apt-get didn’t help either.
  • Installing Ubuntu 18.04 and using apt-get worked. Could this be a Linux kernel version issue caused by the hardware?
  • Digging further, the more accurate symptom is that emacs 25 works fine while 26 and later all fail.
  • With kkeundotnet’s help, found out that we are not alone:
  • Conclusion
    • The cause is the double-buffering feature introduced in emacs 26, which VcXsrv versions below 1.20.1.2 do not support.
    • My VcXsrv version was 1.18.xx. After installing the latest, 1.20.xx, I confirmed that both emacs 26 and 27 from apt-get work!
    • Times have really changed. Intel is falling, and we run Linux on Windows. If you really want to do Linux development now, you should use Windows, not a MacBook (by kkeundotnet).
