Comments (10)
It looks like it's not handling the :EXIT message here (which is a bug), but I'm not sure why it's terminating there. Do you know why it is terminating? Is the supervisor shutting down for some reason? Can you share the code for Siteguardian.Scheduler.Registry and Siteguardian.Application.start?
from citrine.
Application start code:
def start(_type, _args) do
  children = [
    # Start the Ecto repository
    Siteguardian.Repo,
    # Start the Telemetry supervisor
    SiteguardianWeb.Telemetry,
    # Start the PubSub system
    {Phoenix.PubSub, name: Siteguardian.PubSub},
    # Start the Endpoint (http/https)
    SiteguardianWeb.Endpoint,
    # Start the Citrine Scheduler
    Siteguardian.Scheduler,
    # Start a Command Runner
    Siteguardian.CommandRunner
    # Start a worker by calling: Siteguardian.Worker.start_link(arg)
    # {Siteguardian.Worker, arg}
  ]

  # See https://hexdocs.pm/elixir/Supervisor.html
  # for other strategies and supported options
  opts = [strategy: :one_for_one, name: Siteguardian.Supervisor]
  start_val = Supervisor.start_link(children, opts)

  for domain <- Siteguardian.Automation.list_active_domains() do
    Siteguardian.Scheduler.put_job(%Citrine.Job{
      id: "job_#{domain.name}",
      # Five-field cron expression: runs every minute
      schedule: "* * * * *",
      task: fn -> TaskRunner.start(domain) end
    })
  end

  start_val
end
I have been able to narrow it down somewhat, and it seems to be related to the following snippet of code:
case System.cmd("/bin/sh", ["-c", command]) do
  {output, 0} ->
    output
    |> format_output
    |> output_to_map(%{})
    |> IO.inspect()

    {:ok, domain}

  _ ->
    IO.inspect("Nope")
    {:error, "Error: unable to load certificate"}
end
That snippet is called as part of the task execution; without it, everything seems to work fine.
Ah okay, I see what's happening. Internally, System.cmd uses Port.
Here's a note about it:
Internally, this function uses a Port for interacting with the outside world. However, if you plan to run a long-running program, ports guarantee stdin/stdout devices will be closed but it does not automatically terminate the program. The documentation for the Port module describes this problem and possible solutions under the "Zombie processes" section.
And, as per the Port docs:
On its turn, the port will send the connected process the following messages:
- {port, {:data, data}} - data sent by the port
- {port, :closed} - reply to the {pid, :close} message
- {port, :connected} - reply to the {pid, {:connect, new_pid}} message
- {:EXIT, port, reason} - exit signals in case the port crashes. If reason is not :normal, this message will only be received if the owner process is trapping exits
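The messages in that list can be observed directly. The following is a minimal sketch, assuming a Unix-like system with `echo` on the PATH; it opens a port from a process that traps exits and collects whatever the port sends. The `collect` helper and the 2-second timeout are illustrative choices, and the `:exit_status` option is added here only to make the exit code visible.

```elixir
# Trap exits so that a port closing with reason :normal shows up as a message.
Process.flag(:trap_exit, true)

port = Port.open({:spawn, "echo hi"}, [:binary, :exit_status])

# Hypothetical helper: gather port messages until the :EXIT arrives or we time out.
collect = fn collect, acc ->
  receive do
    {^port, {:data, data}} ->
      collect.(collect, [{:data, data} | acc])

    {^port, {:exit_status, status}} ->
      collect.(collect, [{:exit_status, status} | acc])

    {:EXIT, ^port, reason} ->
      Enum.reverse([{:exit, reason} | acc])
  after
    2_000 -> Enum.reverse(acc)
  end
end

messages = collect.(collect, [])
IO.inspect(messages)
```

A process that traps exits but has no clause for `{:EXIT, _, _}` in its `handle_info/2` would crash with a FunctionClauseError on the last of these messages.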
I'm not 100% sure what the idiomatic Elixir way to handle this is. If I were to guess, I think your code should wrap the System.cmd call with a Task so it doesn't propagate to the executor. I also think the commit I just added (30f4c57) was probably not necessary.
Thanks for this. I was also looking into a DynamicSupervisor to pass the command execution to a worker.
The easiest way may be Task.Supervisor.
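One way the Task.Supervisor suggestion could look is sketched below. This is an assumption-laden example, not code from Citrine or Siteguardian: the module name `CommandSketch`, the supervisor name `CommandSketch.TaskSupervisor`, and the 5-second timeout are all hypothetical. The point is that `Task.Supervisor.async_nolink/2` does not link the task to the caller, so the caller should not receive an `{:EXIT, pid, :normal}` message when the task finishes.

```elixir
defmodule CommandSketch do
  # Runs a shell command in a task that is supervised but NOT linked to the caller.
  def run(command, timeout \\ 5_000) do
    task =
      Task.Supervisor.async_nolink(CommandSketch.TaskSupervisor, fn ->
        System.cmd("/bin/sh", ["-c", command])
      end)

    # Wait for the result; if it does not arrive in time, shut the task down.
    case Task.yield(task, timeout) || Task.shutdown(task) do
      {:ok, {output, 0}} -> {:ok, output}
      {:ok, {output, status}} -> {:error, {:exit_status, status, output}}
      {:exit, reason} -> {:error, {:task_exit, reason}}
      nil -> {:error, :timeout}
    end
  end
end

# In an application this would normally be a child in the supervision tree,
# e.g. {Task.Supervisor, name: CommandSketch.TaskSupervisor}.
{:ok, _sup} = Task.Supervisor.start_link(name: CommandSketch.TaskSupervisor)

result = CommandSketch.run("echo hello")
IO.inspect(result)
```

Whether this fully avoids the crash in the job executor depends on how Citrine invokes the job's task function, which this sketch does not cover.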
Task.async(fn -> execute_command(command) end)
|> Task.await()
|> IO.inspect()
This will still result in:
[error] GenServer {Siteguardian.Scheduler.Registry, "job_coderoncode.com"} terminating
** (FunctionClauseError) no function clause matching in Citrine.JobExecutor.handle_info/2
(citrine 0.1.11) lib/citrine/job_executor.ex:101: Citrine.JobExecutor.handle_info({:EXIT, #PID<0.622.0>, :normal}, %{cron_expr: ~e[* * * * * *], job: %Citrine.Job{extended_syntax: false, id: "job_coderoncode.com", schedule: "* * * * *", task: #Function<1.74548289/0 in Siteguardian.Application.start/2>}, registry: Siteguardian.Scheduler.Registry, timer: #Reference<0.589530647.1897136130.163894>})
(stdlib 3.13.2) gen_server.erl:680: :gen_server.try_dispatch/4
(stdlib 3.13.2) gen_server.erl:756: :gen_server.handle_msg/6
(stdlib 3.13.2) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Last message: {:EXIT, #PID<0.622.0>, :normal}
State: %{cron_expr: ~e[* * * * * *], job: %Citrine.Job{extended_syntax: false, id: "job_coderoncode.com", schedule: "* * * * *", task: #Function<1.74548289/0 in Siteguardian.Application.start/2>}, registry: Siteguardian.Scheduler.Registry, timer: #Reference<0.589530647.1897136130.163894>}
You need something to handle that exit event, or else it will bubble up. I'm actually not sure if the unhandled event is harmful; does it keep running after that?
In any case, you need to wrap your task with something that can trap the exits and handle them accordingly.
Yeah, that's the tricky bit. I'm having a hard time finding the idiomatic way to handle that. That said, I did try adding the handle_info from your earlier commit, which does kind of work by handling the exit; however, that also terminates the job:
@impl true
def handle_info({:EXIT, _pid, :normal}, state) do
  {:stop, :normal, state}
end
[debug] finished job id=job_siteguardian.dev in 0.468161s
[debug] terminating Citrine.JobExecutor with reason: :normal and state=%{cron_expr: ~e[* * * * * *], job: %Citrine.Job{extended_syntax: false, id: "job_siteguardian.dev", schedule: "* * * * *", task: #Function<1.47134453/0 in Siteguardian.Application.start/2>}, registry: Siteguardian.Scheduler.Registry, timer: #Reference<0.3858931454.2454716417.247372>}
I could in theory use my fork and {:EXIT, _pid, :normal} to reschedule the job, but that feels kind of wrong.
There's a discussion of a similar issue here: https://elixirforum.com/t/supervising-async-tasks/14412/6
I think if it were me, the easiest thing would be to write a tiny GenServer that is wrapped by a Task and then you can trap the exit and handle it accordingly there. There's an example in the thread I just linked to which you could adapt.
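A minimal sketch of the "tiny GenServer that traps exits" idea follows. It is not the example from the linked forum thread, and `CommandWorkerSketch` and its `run/2` function are hypothetical names. The worker traps exits, runs the command itself, and swallows any `{:EXIT, _, _}` message with `{:noreply, state}` so that it keeps running rather than stopping.

```elixir
defmodule CommandWorkerSketch do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Runs `command` inside the worker and returns {exit_status, output}.
  def run(server, command), do: GenServer.call(server, {:run, command}, 10_000)

  @impl true
  def init(:ok) do
    # Exit signals from linked ports and processes now arrive as messages.
    Process.flag(:trap_exit, true)
    {:ok, %{}}
  end

  @impl true
  def handle_call({:run, command}, _from, state) do
    {output, status} = System.cmd("/bin/sh", ["-c", command])
    {:reply, {status, output}, state}
  end

  @impl true
  def handle_info({:EXIT, _from, _reason}, state) do
    # Ignore exit messages instead of stopping, so the worker stays alive.
    {:noreply, state}
  end

  def handle_info(_other, state), do: {:noreply, state}
end

{:ok, worker} = CommandWorkerSketch.start_link()
result = CommandWorkerSketch.run(worker, "echo ok")
IO.inspect(result)
```

Because `handle_call/3` blocks while the command runs, a single worker handles one command at a time; running several in parallel would need a pool or one worker per job.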
@brndnmtthws Thank you, that last thread was exactly what I needed, and this is working for now (I think the GenServer might need some tweaking for scale).
Thanks again, I'm going to close the issue.