Towards v0.0.1 #1

Open
wants to merge 4 commits into
base: master
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
/Manifest.toml
/logs
13 changes: 13 additions & 0 deletions Oolong.toml
@@ -0,0 +1,13 @@
banner = true
color = true

[logging]
log_level = "Debug"
date_format = "yyyy-mm-ddTHH:MM:SS.s"

[logging.driver_logger]
console_logger.is_expand_stack_trace = true
rotating_logger.path = "./logs"
rotating_logger.file_format = "YYYY-mm-dd.\\l\\o\\g"

[logging.loki_logger]
9 changes: 8 additions & 1 deletion Project.toml
@@ -1,12 +1,19 @@
name = "Oolong"
uuid = "c9dcc2fc-6356-41de-aa29-480ea90c21cd"
authors = ["Jun Tian <[email protected]> and contributors"]
version = "0.1.0"
version = "0.0.1"

[deps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Configurations = "5218b696-f38b-4ac9-8b61-a12ec717816d"
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
Distributed = "8ba89e20-285c-5b6f-9357-94700520ee1b"
GarishPrint = "b0ab02a7-8576-43f7-aa76-eaa7c3897c54"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
LoggingExtras = "e6f89c97-d47a-5376-807f-9c37f3926c36"
LokiLogger = "51d429d1-9683-4c89-86d7-889f440454ef"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
YAML = "ddb6d928-2868-570f-bddf-ab3f9cf99eb6"

[compat]
julia = "1"
200 changes: 110 additions & 90 deletions README.md
@@ -1,36 +1,28 @@
# Oolong.jl
<pre>
<img src="./docs/logo.svg" alt="Oolong.jl logo" title="Oolong.jl" align="left" width="180"/>
____ _ | > 是非成败转头空
/ __ \ | | | > Success or failure,
| | | | ___ | | ___ _ __ __ _ | > right or wrong,
| | | |/ _ \| |/ _ \| '_ \ / _` | | > all turn out vain.
| |__| | (_) | | (_) | | | | (_) | |
\____/ \___/|_|\___/|_| |_|\__, | | <a href="https://www.vincentpoon.com/the-immortals-by-the-river-----------------.html">The Immortals by the River </a>
__/ | | -- <a href="https://zh.wikipedia.org/zh-hans/%E6%9D%A8%E6%85%8E">Yang Shen </a>
|___/ | (Translated by <a href="https://en.wikipedia.org/wiki/Xu_Yuanchong">Xu Yuanchong</a>)
</pre>

*An actor framework for [ReinforcementLearning.jl](https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl)*
**Oolong.jl** is a framework for building scalable distributed applications in Julia.

> “是非成败转头空” —— [《临江仙》](https://www.vincentpoon.com/the-immortals-by-the-river-----------------.html)
> [杨慎](https://zh.wikipedia.org/zh-hans/%E6%9D%A8%E6%85%8E)
>
> "Success or failure, right or wrong, all turn out vain." - [*The Immortals by
> the
> River*](https://www.vincentpoon.com/the-immortals-by-the-river-----------------.html),
> [Yang Shen](https://en.wikipedia.org/wiki/Yang_Shen)
>
> (Translated by [Xu Yuanchong](https://en.wikipedia.org/wiki/Xu_Yuanchong))
## Features

## Roadmap
- Easy to use
Only a minimal API is exposed to keep this package easy to use (yes, easier than [Distributed.jl](https://docs.julialang.org/en/v1/stdlib/Distributed/)).

- Non-invasive
Users can easily extend existing packages to apply them in a cluster.

- [x] Figure out a set of simple primitives for running distributed
applications.
- [ ] Apply this package to some typical RL algorithms:
- [x] Parameter server
- [x] Batch serving
- [ ] Add macro to expose a http endpoint
- [ ] A3C
- [ ] D4PG
- [ ] AlphaZero
- [ ] Deep CFR
- [ ] NFSP
- [ ] Evolution algorithms
- [ ] Resource management across nodes
- [ ] State persistence and fault tolerance
- [ ] Configurable logging and dashboard
- [LokiLogger.jl](https://github.com/fredrikekre/LokiLogger.jl)
- [Stipple.jl](https://github.com/GenieFramework/Stipple.jl)
- Fault tolerance

- Auto scaling

## Get Started

@@ -44,86 +36,114 @@ pkg> activate --temp
pkg> add https://github.com/JuliaReinforcementLearning/Oolong.jl
```

`Oolong.jl` adopts the [actor model](https://en.wikipedia.org/wiki/Actor_model) to
parallelize your existing code. One of the core APIs defined in this package is
the `@actor` macro.
See the tests for some example usages. (TODO: move typical examples here when the APIs are stable)

```julia
using Oolong
## Examples

A = @actor () -> @info "Hello World"
```
- Batch evaluation.
- AlphaZero
- Parameter server
- Parameter search

By putting the `@actor` macro before an arbitrary callable object, we define an
**actor**, which we can call as usual:
Please contact us if you have a concrete scenario but are not sure how to use this package!

```julia
A();
```
## Deployment

You'll see something like this on your screen:
### Local Machines

```
Info:[2021-06-30 22:59:51](@/user/#1)Hello World
```

Next, let's make sure anonymous functions with positional and keyword arguments
can also work as expected:
### K8S

```julia
A = @actor (msg;suffix="!") -> @info "Hello " * msg * suffix
A("World";suffix="!!!")
# Info:[2021-06-30 23:00:38](@/user/#5)Hello World!!!
```
## Roadmap

For some functions, we are more interested in the returned value.
1. Stage 1
1. Stabilize API
1. ☑️ `p::PotID = @pot tea [prop=value...]`, define a container over any callable object.
2. ☑️ `(p::PotID)(args...;kw...)`, which behaves just like `tea(args...;kw...)`, except that it is an async call with at-most-once delivery, and a `Promise` is returned.
3. ☑️ `msg |> p::PotID`, similar to the one above, except that nothing is returned.
4. ☑️ `(p::PotID).prop`, an async call with at-most-once delivery that returns the `prop` of the inner `tea`.
5. 🧐 `-->`, `<--`, define a streaming pipeline.
6. 🧐 timed wait on `Promise`.
2. Features
1. ☑️ Logging. All messages are sent to primary node by default.
2. 🧐 RemoteREPL
3. ☑️ CPU/GPU allocation
4. 🧐 Auto install + `using` dependencies
5. ☑️ Global configuration
6. 🧐 Close pot when it is idle for a period
3. Example usages
1. 🧐 Parameter search
2. 🧐 Batch evaluation.
3. 🧐 AlphaZero
4. 🧐 Parameter server
2. Stage 2
1. Auto scaling. Allow workers to join/exit?
1. 🧐 Custom cluster manager
2. Dashboard
1. 🧐 [grafana](https://grafana.com/)
3. Custom Logger
1. ☑️ [LokiLogger.jl](https://github.com/fredrikekre/LokiLogger.jl)
2. 🧐 [Stipple.jl](https://github.com/GenieFramework/Stipple.jl)
4. Tracing
1. [opentelemetry](https://opentelemetry.io/)
3. Stage 3
1. Drop out Distributed.jl?
1. 🧐 `Future` will transfer the ownership of the underlying data to the caller. This is not very efficient when the data is passed back and forth several times in its life cycle.
2. 🧐 differentiate across pots?
3. 🧐 Python client (transpile, pickle)
4. 🧐 K8S
5. 🧐 JuliaHub
6. 🧐 AWS
7. 🧐 Azure
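
The call semantics listed under "Stabilize API" above (an async call that hands back a `Promise`) can be pictured with plain Base Julia. The following is only a minimal sketch under stated assumptions: `SketchPot` and `SketchPromise` are hypothetical stand-ins, not Oolong.jl's `@pot`, `PotID` or `Promise`, and fetching with `res[]` is assumed here for illustration.

```julia
# Hypothetical stand-ins for illustration; these are NOT Oolong.jl's types.
struct SketchPromise
    ch::Channel{Any}   # buffered channel that will hold the single result
end

# Block until the value is available (assumed `res[]` style of fetching).
Base.getindex(p::SketchPromise) = fetch(p.ch)

struct SketchPot{T}
    tea::T             # any callable object
end

# Calling the pot schedules `tea(args...; kw...)` on a task and returns at once.
function (p::SketchPot)(args...; kw...)
    ch = Channel{Any}(1)
    @async put!(ch, p.tea(args...; kw...))
    return SketchPromise(ch)
end

greet = SketchPot((msg; suffix="!") -> "Hello " * msg * suffix)
res = greet("World"; suffix="!!!")   # returns immediately with a promise
println(res[])                        # blocks until the result is ready
```

This sketch runs everything in one process and gives no delivery guarantees; it only shows the shape of "call now, fetch later".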

## Design

### Workflow

```julia
A = @actor msg -> "Hello " * msg
res = A("World")
```

Unlike an ordinary function call, this returns a `Future`-like result instead of
the value itself. We can then fetch the value with the
following syntax:

```julia
res[]
# "Hello World"
+--------+
| Flavor |
+--------+
|
V +-------------+
+---+---+ | Pot |
| PotID |<===>| |
+---+---+ | PotID |
| | () -> Tea |
| | require |
| +-------------+
+-------|-------------------------+
| V boiled somewhere |
| +----+----+ |
| | Channel | |
| +----+----+ |
| | |
| V +-----------+ |
| +--+--+ | PotState | |
| | Tea |<===>| | |
| +--+--+ | Children | |
| | +-----------+ |
| V |
| +----+----+ |
| | Promise | |
| +---------+ |
+---------------------------------+
```

To maintain the internal states across different calls, we can also apply `@actor`
to a customized structure:
A `Pot` is mainly a container for an arbitrary object (`tea`), which is instantiated by calling a parameterless function. Whenever a `Pot` receives a `flavor`, the water in the `Pot` is *boiled* first if it is cool: a `task` is created to process `tea` and `flavor` when the previous `task` has exited, either by accident or on demand. Some `Pot`s have specific `require`ments (for example, the number of CPUs or GPUs). If those requirements cannot be satisfied, the `Pot` stays pending until new resources become available. Users define how `tea` and `flavor` are processed through multiple dispatch on `process(tea, flavor)`. Inside a `task`, users may create other `Pot`s, whose references (`PotID`s) are stored in `Children`. A `PotID` is simply a path used to locate a `Pot`.
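
The paragraph above can be illustrated with a small single-process sketch. Only the name `process(tea, flavor)` comes from the text; `Counter`, `Increment`, `Report` and `boil` are hypothetical names chosen for this example, and a plain `Channel` plus `@async` stands in for whatever Oolong.jl does internally.

```julia
# Hypothetical names throughout; only `process(tea, flavor)` is taken from the text.
mutable struct Counter
    n::Int
end

struct Increment end            # a flavor that mutates the tea
struct Report
    reply::Channel{Int}         # a flavor that carries a place to put the answer
end

# Multiple dispatch decides how each (tea, flavor) pair is handled.
process(c::Counter, ::Increment) = (c.n += 1; nothing)
process(c::Counter, r::Report) = put!(r.reply, c.n)

# "Boil" the pot: build the tea from a parameterless function,
# then process flavors one at a time until the channel is closed.
function boil(make_tea, flavors::Channel)
    @async begin
        tea = make_tea()
        for flavor in flavors
            process(tea, flavor)
        end
    end
end

flavors = Channel{Any}(32)
task = boil(() -> Counter(0), flavors)

for _ in 1:10
    put!(flavors, Increment())
end

reply = Channel{Int}(1)
put!(flavors, Report(reply))
total = take!(reply)            # waits until the Report flavor has been processed

close(flavors)                  # lets the processing loop finish
wait(task)
println(total)
```

Because flavors are taken from the channel one at a time, the `tea` is only ever touched by a single task, which is the usual reason for keeping state behind a mailbox.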

```julia
Base.@kwdef mutable struct Counter
n::Int = 0
end

(c::Counter)() = c.n += 1
### Decisions

A = @actor Counter()
The following design decisions need to be reviewed continuously.

for _ in 1:10
A()
end
1. Each `Pot` can only be created inside another `Pot`, which forms a child-parent relation. If no `Pot` is found in the `current_task()`, the parent is bound to `/user` by default. When a new `Pot` is registered under a `PotID` that is already registered, the old one is removed first. This allows `Pot`s to be updated dynamically. (Do we really need this feature?)

n = A.n

n[]
# 10
```

Note that, as with a function call, `A.n` also returns a `Future`-like object.

### Tips

- Be careful with `self()`
### FAQ

## Acknowledgement

This package is mainly inspired by the following packages:
This package is mainly inspired by the following projects:

- [Actors.jl](https://github.com/JuliaActors/Actors.jl)
- [Orleans](https://github.com/dotnet/orleans)
- [Proto.Actor](https://proto.actor/)
- [Ray](https://ray.io/)
- [Actors.jl](https://github.com/JuliaActors/Actors.jl)
42 changes: 42 additions & 0 deletions docs/logo.jl
@@ -0,0 +1,42 @@
using Luxor

scale = 2
ratio = (√5 -1)/2

w, h = scale * 128 * 2, scale * 128 * 2

r1 = w/2
r2 = r1 * ratio
r3 = r2 * ratio
r4 = r3 * ratio

c1 = Point(0, 0)
c2 = c1 + Point(r1-r2, r1-r2) * √2/2
c3 = c1 + Point(r1-r3, r1-r3) * √2/2
c4 = c1 + Point(r1-r4, r1-r4) * √2/2

Drawing(w, h, "logo.svg")
background(1, 1, 1, 0)
Luxor.origin()

setcolor(1,1,1)
circle(c1, r1, :fill)
setcolor(0.251, 0.388, 0.847) # dark blue
circle(c1, r1-4*scale, :fill)

setcolor(1,1,1)
circle(c2, r2, :fill)
setcolor(0.796, 0.235, 0.2) # dark red
circle(c2, r2-4*scale, :fill)

setcolor(1,1,1)
circle(c3, r3, :fill)
setcolor(0.22, 0.596, 0.149) # dark green
circle(c3, r3-4*scale, :fill)

setcolor(1,1,1)
circle(c4, r4, :fill)
setcolor(0.584, 0.345, 0.698) # dark purple
circle(c4, r4-4*scale, :fill)

finish()
Binary file added docs/logo.png
13 changes: 13 additions & 0 deletions docs/logo.svg
12 changes: 5 additions & 7 deletions src/Oolong.jl
@@ -3,12 +3,10 @@ module Oolong
const OL = Oolong
export OL

include("core.jl")
include("parameter_server.jl")
include("serve.jl")

function __init__()
init()
end
include("config.jl")
include("logging.jl")
include("base.jl")
include("core/core.jl")
include("start.jl")

end