smctwelve / f1-bot

🏁 A Discord bot to view F1 stats.

License: MIT License

Python 99.58% Dockerfile 0.42%

bot data data-visualization discord ergast f1 formula1 motorsport pandas pycord racing statistics

f1-bot's Issues

Memory optimisations

Issue
Process memory accumulates after repeated command execution and does not appear to be freed. Memory usage spikes when processing large data (e.g. telemetry) and does not return to normal afterwards, often leading to over 1GB of reserved memory when the bot is left running for some time.

Investigating with tracemalloc pointed to FastF1 and Pandas with the highest consumption, which makes sense given the data processing involved and that they are primarily intended for single run batch processing rather than a continuous application.

Mem at start: ~180MB
Mem after /plot-telemetry: ~550MB
Mem after multiple commands: ~1GB

Reserved blocks are not freed when the command completes, nor do they appear to be reused for the next command, as additional memory is allocated on each execution.
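
The kind of tracemalloc investigation described above could be sketched roughly as follows. This is only an illustration using the standard library: the `measure` helper and the stand-in workload are hypothetical and not taken from the bot's code.

```python
import tracemalloc


def measure(command, limit=5):
    """Run `command` and return the source lines that allocated the most memory.

    `command` is any zero-argument callable (hypothetical stand-in for a bot command).
    """
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = command()  # keep a reference so the allocations are still live
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    del result
    # compare_to returns the differences sorted from largest to smallest
    return after.compare_to(before, "lineno")[:limit]


# Stand-in workload: roughly 10MB of small byte strings
for stat in measure(lambda: [bytes(1000) for _ in range(10_000)]):
    print(stat)
```

Taking the two snapshots around several consecutive runs of the same command would show whether memory from earlier runs is being reused or keeps growing.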

Solution
#21

Not much can be done about the core Pandas memory consumption, but using memory_profiler I was able to narrow down where in the bot code large chunks were being allocated. Unsurprisingly, one was fastf1.load_session, which involves a lot of Pandas operations; more surprising were matplotlib.pyplot figures, which would often allocate over 200MB on each execution and then linger in memory. Some research turned up discussions about pyplot figures sometimes not being garbage collected.

Replacing plt.subplots with directly instantiating Figure reduced the allocated memory. Calling plt.close() when saving, combined with deleting intermediate DataFrames after processing and forcing collection with gc.collect() after execution, brings memory consumption closer to what I expect.
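
A minimal sketch of that approach, assuming matplotlib is installed. The `render_plot` helper is hypothetical and is not the bot's actual plotting code; it only illustrates building a Figure without pyplot and cleaning up afterwards.

```python
import gc
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_plot(x, y) -> io.BytesIO:
    """Render a line plot to an in-memory PNG without going through pyplot."""
    # A Figure created directly is not registered in pyplot's global figure
    # list, so nothing keeps it alive once our own references are dropped.
    fig = Figure(figsize=(8, 4), dpi=100)
    FigureCanvasAgg(fig)  # attach a non-interactive Agg canvas
    ax = fig.subplots()
    ax.plot(x, y)

    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    buffer.seek(0)

    # Drop references to the figure objects and force a collection pass
    del ax, fig
    gc.collect()
    return buffer


png = render_plot([0, 1, 2], [0, 1, 4])
print(len(png.getvalue()), "bytes")
```

For figures that are still created through pyplot, calling plt.close(fig) after saving serves the same purpose of releasing pyplot's reference.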

Peak usage can still spike to ~750MB for a few seconds when processing large uncached data, but it returns to the ~200MB range once the command has executed instead of remaining reserved.

Further optimisations
Usage may still slowly creep up over time, but not as drastically as before. I'm not sure what else can be optimised that is not related to internal Pandas/FastF1 functions.
One option might be to replace the BytesIO memory buffer used in utils.plot_to_file before uploading with a temp file on disk.

Get results by season

Add a season parameter to the functions in data.py to allow overriding the BASE_URL with a season. It will still default to the current season if none is specified.
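
As a rough sketch of the idea, assuming BASE_URL holds the Ergast root without a season segment. The `build_url` helper and its signature are hypothetical, not the actual contents of data.py.

```python
from typing import Optional

# Assumed value: the Ergast API root, without a season segment
BASE_URL = "https://ergast.com/api/f1"


def build_url(endpoint: str, season: Optional[int] = None) -> str:
    """Build an API URL, defaulting to the current season when none is given."""
    season_part = "current" if season is None else str(season)
    return f"{BASE_URL}/{season_part}/{endpoint}"


print(build_url("driverStandings"))        # uses the "current" season
print(build_url("driverStandings", 2018))  # overrides with an explicit season
```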

Create automated Docker builds for the project

As discussed in #9, this Docker image is useful to people as they can inject the token environment variable without having to build and maintain the image themselves.

Improvements

Usability:

Move legend for race position plot to the left so it aligns with driver order
Remove default help command (or send it via PM) and link to readme instead
Fix plot pos and plot laps so it works with default season and round if none given (drivers is already optional, season and round will still be required before drivers)
Use simpler table formatting to reduce space
Send large tables as DMs or show partial table with external link for full results
pitstops should require a driver rather than show all stops, plot stints already does that more clearly
Handle !f1 with no command given as help message
Remove laps command or require a specific lap number or range of laps, posting 70 rows is not ideal
pm on/off command to temporarily disable sending results as PM for that user

Data:
Using Ergast API, add functions in api.py:

-- Needs a reliable API source

Penalties
Component usage

Analysis:
stats command group to perform statistical analysis and output graphs in Discord:

Driver vs Driver
- Points
- Wins
- Poles
- Fastest laps
- Penalties

Calendar times do not account for daylight savings

The time parser currently formats to '%H:%M UTC' per the datetime format codes, which does not account for future races held during daylight savings.

Ideally the bot should output times correct to the user locale, like https://www.f1calendar.com/, which must somehow be determined beforehand through Discord. Using a command like !f1 schedule should list all races with the correct starting times accurate to the race date.

Example:
Getting https://ergast.com/api/f1/current/next returns Australian GP, 17 March 05:10:00Z.

By the race date, UK and EU will be on daylight savings time so the race will actually be 06:10:00 BST for a user in the UK.

Solution:
Implement a function or library to account for races held during daylight savings. Convert the race times received from Ergast API to be accurate to the user's local time on the date of the race.
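
One possible sketch of such a conversion, using Python's standard zoneinfo module (3.9+) and the system time zone database. The function names and the sample timestamp are illustrative assumptions. The second helper relies on Discord's timestamp markup (`<t:UNIX:F>`), which Discord clients render in each viewer's own locale and would avoid having to know the user's time zone at all.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO timestamp with a trailing 'Z' into an aware UTC datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def local_start(timestamp: str, zone: str) -> datetime:
    """Convert a UTC race start to a named time zone, DST-aware for that date."""
    return parse_utc(timestamp).astimezone(ZoneInfo(zone))


def discord_timestamp(timestamp: str) -> str:
    """Format a race start as a Discord timestamp tag (long date/time style)."""
    return f"<t:{int(parse_utc(timestamp).timestamp())}:F>"


# Illustrative timestamp only, not real calendar data
start = "2024-07-07T14:00:00Z"
print(local_start(start, "Europe/London").strftime("%H:%M %Z"))
print(discord_timestamp(start))
```

Because the offset is looked up for the race date itself rather than for today, the same function handles races on either side of a clock change.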

Issues regarding certain commands. (version 2.0.4)

Hi, I've been using this repo for quite a while now, modifying and experimenting with certain things, etc. I currently face a few issues:
1: The /career command fails for some drivers, for example hamilton and verstappen, while other driver names work. (I have tried both driver codes and full surnames.) I see this error: Cannot operate on a closed database. When I first installed the bot this command worked fine. I did make changes to the season and plot commands, but I do not think they would have affected the existing career command, as I have not touched the API folder. I run the bot via PowerShell. I had started to think this error was due to API changes, but I'm not sure. (UPDATE: I have tried version 2.0.3, and there the career command works fine for all drivers that had issues in version 2.0.4.)

2: It has been a few days since the Jeddah GP, and result data is usually out within a few hours of the race ending. However, the result data has not been updated at the time of writing. I've tested other telemetry commands for the same GP and they work fine. This is the error I see: Session data unavailable. If the session finished recently, check again later. I suspected this was also due to internal changes in the API itself. (UPDATE: Sorry, this was caused by the recent fastf1 3.3.1 release.)

Command and Plotting suggestions

New ideas for stats and visualisations to add.

Suggestions and help are appreciated.

Championship points timeline
- Acquire points standings accumulation from each weekend up to date of command execution
Eligible WDC winners
- Not sure of best approach for this
Analysis of team/car performance
- Tried average laptimes/speed but graphs weren't all that interesting (average times are pretty close, fastest times already shown from driver fastest laps). Needs a better approach; maybe using a quicklaps threshold.
Driver lap difference to average track laptime and track record
Experiment with live timing telemetry
- Probably not feasible within the scope of a bot
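
For the championship points timeline idea, the accumulation step could be as simple as a running sum over the points scored at each round. The sketch below uses made-up points values purely for illustration, and `points_timeline` is a hypothetical helper.

```python
from itertools import accumulate


def points_timeline(points_per_round):
    """Turn points scored at each round into a running championship total."""
    return list(accumulate(points_per_round))


# Made-up per-round points for two drivers (illustrative only)
season = {
    "driver_a": [25, 18, 25],
    "driver_b": [18, 25, 15],
}
for driver, points in season.items():
    print(driver, points_timeline(points))
```

Each resulting list can be plotted against the round number to draw one line per driver.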

new plot gap issue

Hi, after updating I get this error when using /plot gap or /plot avg-lap-delta, but only for the Singapore GP.

command: /plot gap sai lec 2023 15 Race

2023-09-18 09:39:05,154 f1-bot: INFO Command: /plot gap in Alexdelli f1-analysis by alexdelli#0
core WARNING No lap data for driver 18
core WARNING Failed to perform lap accuracy check - all laps marked as inaccurate.
Ignoring exception in on_interaction
Traceback (most recent call last):
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 124, in wrapped
ret = await coro(arg)
^^^^^^^^^^^^^^^
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 978, in _invoke
await self.callback(self.cog, ctx, **kwargs)
File "/home/container/f1/cogs/plot.py", line 648, in gap
telemetry = {
^
File "/home/container/f1/cogs/plot.py", line 649, in
d: s.laps.pick_drivers(d).pick_fastest().get_car_data(interpolate_edges=True).add_distance()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/container/.local/lib/python3.11/site-packages/fastf1/core.py", line 2910, in get_car_data
car_data = self.session.car_data[self['DriverNumber']].slice_by_lap(self, **kwargs).reset_index(drop=True)
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
KeyError: nan
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/container/.local/lib/python3.11/site-packages/discord/client.py", line 378, in _run_event
await coro(*args, **kwargs)
File "/home/container/.local/lib/python3.11/site-packages/discord/bot.py", line 1167, in on_interaction
await self.process_application_commands(interaction)
File "/home/container/.local/lib/python3.11/site-packages/discord/bot.py", line 848, in process_application_commands
await self.invoke_application_command(ctx)
File "/home/container/.local/lib/python3.11/site-packages/discord/bot.py", line 1118, in invoke_application_command
await ctx.command.dispatch_error(ctx, exc)
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 421, in dispatch_error
await wrapped(ctx, error)
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 104, in wrapped
ret = await coro(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/container/f1/cogs/plot.py", line 727, in cog_command_error
raise error
File "/home/container/.local/lib/python3.11/site-packages/discord/bot.py", line 1114, in invoke_application_command
await ctx.command.invoke(ctx)
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 375, in invoke
await injected(ctx)
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 124, in wrapped
ret = await coro(arg)
^^^^^^^^^^^^^^^
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 1312, in _invoke
await command.invoke(ctx)
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 375, in invoke
await injected(ctx)
File "/home/container/.local/lib/python3.11/site-packages/discord/commands/core.py", line 132, in wrapped
raise ApplicationCommandInvokeError(exc) from exc
discord.errors.ApplicationCommandInvokeError: Application Command raised an exception: KeyError: na

Implement caching

Requesting data from the Ergast API is inefficient when data is unlikely to change between race weekends. Some commands such as lap timings take a long time for the API to respond.

Cache API results in Redis using the aioredis library. When processing commands, check the cache first. Explore ways to vary cache expiry, particularly for race weekends where data may be continuously changing between sessions.

Possible Solutions:
Default to a long (or even persistent) expiry for the cached data. Get the current race calendar from API and implement a background process to periodically check the date is within +/- X days to the next race. This allows updates from the API from race results to be fresh, otherwise between races when data is static it can persist. The cache can periodically be flushed when data expires.

Add !f1 refresh admin command to manually flush the cache.
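
The expiry rule described above might look something like this sketch. The TTL constants, the `cache_ttl` name and the default window are all assumptions chosen for illustration; the returned value would be passed as the expiry (in seconds) when writing a key to Redis.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical expiry values, in seconds
STATIC_TTL = 7 * 24 * 3600  # between races, data rarely changes
RACE_WEEK_TTL = 15 * 60     # around a race, refresh frequently


def cache_ttl(now: datetime, next_race: datetime, window_days: int = 3) -> int:
    """Pick a short expiry within +/- window_days of the next race, else a long one."""
    if abs(next_race - now) <= timedelta(days=window_days):
        return RACE_WEEK_TTL
    return STATIC_TTL


race = datetime(2024, 7, 7, 14, 0, tzinfo=timezone.utc)  # illustrative date
print(cache_ttl(race - timedelta(days=1), race))   # short expiry near the race
print(cache_ttl(race - timedelta(days=10), race))  # long expiry between races
```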

Alternatively
Clone the Ergast PostgreSQL database and explore ways to speed up queries, particularly where multiple joins are required such as lap data.

Raise Errors instead of return None

Returning None is ambiguous when the calling function expects data. A None condition indicates missing or invalid data from the API, which is appropriate to handle as an Error.

Refactor api.py functions to raise exception if get_soup() returns None. Catch errors with discord.py on_command_error() handler when invoked in commands.py.

E.g.

[. . .]
    if soup:
        # data present, get results
        return results
    # soup is none, data missing
    return None

becomes

[. . .]
    if soup:
        # data present, get results
        return results
    # soup is none, data missing so raise error
    raise MissingDataError()
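
A self-contained sketch of the pattern, with a stubbed get_soup() standing in for the real request and parsing code. Only the MissingDataError and get_soup names come from the proposal above; `get_results`, the stub data and the URLs are hypothetical.

```python
class MissingDataError(Exception):
    """Raised when the API returns no usable data for a request."""


def get_soup(url):
    # Stub for illustration: pretend only one URL has data
    fake_responses = {"https://example.invalid/results": {"results": ["VER", "HAM"]}}
    return fake_responses.get(url)


def get_results(url):
    soup = get_soup(url)
    if soup:
        # data present, get results
        return soup["results"]
    # soup is None, data missing so raise error
    raise MissingDataError(f"No data returned for {url}")


try:
    get_results("https://example.invalid/missing")
except MissingDataError as err:
    # In the bot this would be caught centrally by the command error handler
    print(f"Handled: {err}")
```

Callers no longer need to check every return value for None, and the error message can be reported to the user from a single handler.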

Driver codes/no substitute surname

API endpoints require driver_id which is based on surname. Ease of use is improved by allowing commands to parse driver codes or number instead and replace them with the correct surname id.

Need a way to automate getting details of every driver supported by the API.

Solutions:

Download Ergast DB image and substitute driver codes/no with ID via queries at runtime
Automate pulling every driver from API and store details in JSON or persistent Redis key
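
For the second option, once the driver list has been pulled and stored, the substitution itself could be a simple lookup. The sketch below assumes records shaped like Ergast driver entries (driverId, code, permanentNumber) and uses a two-entry sample list; `build_index` and `resolve_driver_id` are hypothetical helpers.

```python
# Small sample of driver records, shaped like Ergast driver entries
DRIVERS = [
    {"driverId": "hamilton", "code": "HAM", "permanentNumber": "44"},
    {"driverId": "max_verstappen", "code": "VER", "permanentNumber": "33"},
]


def build_index(drivers):
    """Map every known code, number and id (lower-cased) to the driver id."""
    index = {}
    for driver in drivers:
        for key in ("driverId", "code", "permanentNumber"):
            value = driver.get(key)
            if value:
                index[str(value).lower()] = driver["driverId"]
    return index


def resolve_driver_id(query, index):
    """Return the API driver id for a code, number or id, or raise ValueError."""
    try:
        return index[str(query).strip().lower()]
    except KeyError:
        raise ValueError(f"Unknown driver: {query!r}") from None


index = build_index(DRIVERS)
print(resolve_driver_id("VER", index))
print(resolve_driver_id(44, index))
```

The index could be rebuilt whenever the stored driver list is refreshed, whether that list lives in a JSON file or a Redis key.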

telemetry for two drivers

Hi, is it possible to integrate a command for lap comparison between two drivers, with subplots for speed, gear, throttle, brake and DRS?
Thank you

    @plot.command(name="telemetry",
                  description="Plot the comparison of fastest lap of two driver")
    async def telemetry(ctx, arg1, arg2, arg3, arg4, arg5):
        year = arg1
        track = arg2
        session_time = arg3
        driver_1 = arg4
        driver_2 = arg5
        await ctx.send("{},{},{},{},{}".format(year, track, session_time, driver_1, driver_2))
        progress_bar = await ctx.send('=>...............')
        session = ff1.get_session(int(arg1), str(arg2), str(arg3))
        await progress_bar.edit(content = '===>.............')
        session.load()
        await progress_bar.edit(content = '======>..........')
        laps_driver_1 = session.laps.pick_driver(str(arg4))
        laps_driver_2 = session.laps.pick_driver(str(arg5))
        await progress_bar.edit(content = '=========>.......')
        fastest_driver_1 = laps_driver_1.pick_fastest()
        fastest_driver_2 = laps_driver_2.pick_fastest()
        await progress_bar.edit(content = '============>...')
        telemetry_driver_1 = fastest_driver_1.get_telemetry()
        telemetry_driver_2 = fastest_driver_2.get_telemetry()
        await progress_bar.edit(content = '=============>..')
        delta_time, ref_tel, compare_tel = ff1.utils.delta_time(
            fastest_driver_1, fastest_driver_2)
        team_driver_1 = laps_driver_1['Team'].iloc[0]
        team_driver_2 = laps_driver_2['Team'].iloc[0]
        color_1 = ff1.plotting.team_color(team_driver_1)
        color_2 = ff1.plotting.team_color(team_driver_2)
        if color_1 == color_2:
            color_2='#ffffff'
        await progress_bar.edit(content = '================> Done!')  
        await ctx.send (str(driver_1)+':'+str(fastest_driver_1['LapTime'])[11:19])
        await ctx.send (str(driver_2)+':'+str(fastest_driver_2['LapTime'])[11:19])
        
        # Set the size of the plot
        plt.rcParams['figure.figsize'] = [20, 15]

        # Our plot will consist of 7 "subplots":
        #     - Delta
        #     - Speed
        #     - Throttle
        #     - Braking
        #     - Gear
        #     - RPM
        #     - DRS
        fig, ax = plt.subplots(7,
                                gridspec_kw={'height_ratios': [1, 3, 2, 1, 1, 2, 1]})

        # Set the title of the plot
        ax[0].title.set_text(f"Telemetry comparison {driver_1} vs. {driver_2}")

        # Subplot 1: The delta
        ax[0].plot(ref_tel['Distance'], delta_time, color=color_1)
        ax[0].axhline(0)
        ax[0].set(ylabel=f"Gap to {driver_2} (s)")

        # Subplot 2: Speed
        ax[1].plot(telemetry_driver_1['Distance'],
                    telemetry_driver_1['Speed'],
                    label=driver_1,
                    color=color_1)
        ax[1].plot(telemetry_driver_2['Distance'],
                    telemetry_driver_2['Speed'],
                    label=driver_2,
                    color=color_2)
        ax[1].set(ylabel='Speed')
        ax[1].legend(loc="lower right")

        # Subplot 3: Throttle
        ax[2].plot(telemetry_driver_1['Distance'],
                    telemetry_driver_1['Throttle'],
                    label=driver_1,
                    color=color_1)
        ax[2].plot(telemetry_driver_2['Distance'],
                    telemetry_driver_2['Throttle'],
                    label=driver_2,
                    color=color_2)
        ax[2].set(ylabel='Throttle')

        # Subplot 4: Brake
        ax[3].plot(telemetry_driver_1['Distance'],
                    telemetry_driver_1['Brake'],
                    label=driver_1,
                    color=color_1)
        ax[3].plot(telemetry_driver_2['Distance'],
                    telemetry_driver_2['Brake'],
                    label=driver_2,
                    color=color_2)
        ax[3].set(ylabel='Brake')

        # Subplot 5: Gear
        ax[4].plot(telemetry_driver_1['Distance'],
                    telemetry_driver_1['nGear'],
                    label=driver_1,
                    color=color_1)
        ax[4].plot(telemetry_driver_2['Distance'],
                    telemetry_driver_2['nGear'],
                    label=driver_2,
                    color=color_2)
        ax[4].set(ylabel='Gear')

        # Subplot 6: RPM
        ax[5].plot(telemetry_driver_1['Distance'],
                    telemetry_driver_1['RPM'],
                    label=driver_1,
                    color=color_1)
        ax[5].plot(telemetry_driver_2['Distance'],
                    telemetry_driver_2['RPM'],
                    label=driver_2,
                    color=color_2)
        ax[5].set(ylabel='RPM')

        # Subplot 7: DRS
        ax[6].plot(telemetry_driver_1['Distance'],
                    telemetry_driver_1['DRS'],
                    label=driver_1,
                    color=color_1)
        ax[6].plot(telemetry_driver_2['Distance'],
                    telemetry_driver_2['DRS'],
                    label=driver_2,
                    color=color_2)
        ax[6].set(ylabel='DRS')
        ax[6].set(xlabel='Lap distance (meters)')

        # Hide x labels and tick labels for top plots and y ticks for right plots.
        for a in ax.flat:
            a.label_outer()
        fig.savefig("telemetry.png")
        with open('telemetry.png', 'rb') as f:
            picture = discord.File(f)
            await ctx.send(file=picture)
        plt.clf()
        telemetry_driver_1['Driver'] = driver_1
        telemetry_driver_2['Driver'] = driver_2

        telemetry = pd.concat([telemetry_driver_1, telemetry_driver_2])
        num_minisectors = 25
        total_distance = max(telemetry['Distance'])
        minisector_length = total_distance / num_minisectors

        minisectors = [0]

        for i in range(0, (num_minisectors - 1)):
            minisectors.append(minisector_length * (i + 1))

        # Assign a minisector number to every row in the telemetry dataframe
        telemetry['Minisector'] = telemetry['Distance'].apply(lambda dist: (int(
            (dist // minisector_length) + 1)))
        # Calculate minisector speeds per driver
        average_speed = telemetry.groupby(['Minisector','Driver'])['Speed'].mean().reset_index()

        # Per minisector, find the fastest driver
        fastest_driver = average_speed.loc[average_speed.groupby(
            ['Minisector'])['Speed'].idxmax()]
        fastest_driver = fastest_driver[[
            'Minisector', 'Driver'
        ]].rename(columns={'Driver': 'Fastest_driver'})

        # Merge the fastest_driver dataframe to the telemetry dataframe on minisector
        telemetry = telemetry.merge(fastest_driver, on=['Minisector'])
        telemetry = telemetry.sort_values(by=['Distance'])

        # Since our plot can only work with integers, we need to convert the driver abbreviations to integers (1 or 2)
        telemetry.loc[telemetry['Fastest_driver'] == driver_1,
                        'Fastest_driver_int'] = 1
        telemetry.loc[telemetry['Fastest_driver'] == driver_2,
                        'Fastest_driver_int'] = 2
        # Get the x and y coordinates
        x = np.array(telemetry['X'].values)
        y = np.array(telemetry['Y'].values)

        # Convert the coordinates to points, and then concat them into segments
        points = np.array([x, y]).T.reshape(-1, 1, 2)
        segments = np.concatenate([points[:-1], points[1:]], axis=1)
        fastest_driver_array = telemetry['Fastest_driver_int'].to_numpy().astype(
            float)
        # The segments we just created can now be colored according to the fastest driver in a minisector
        cmap = ListedColormap([color_1, color_2])
        lc_comp = LineCollection(segments,
                                norm=plt.Normalize(1, cmap.N + 1),
                                cmap=cmap)
        lc_comp.set_array(fastest_driver_array)
        lc_comp.set_linewidth(5)
        # Create the plot
        plt.rcParams['figure.figsize'] = [18, 10]
        plt.title(f'Lap Comparison between {driver_1} and {driver_2}')
        plt.gca().add_collection(lc_comp)
        plt.axis('equal')
        plt.tick_params(labelleft=False, left=False, labelbottom=False, bottom=False)

        cbar = plt.colorbar(mappable=lc_comp,
                            boundaries=np.arange(1, 4),
                            ticks=[driver_1, driver_2])
        cbar.set_ticks(np.arange(1.5, 3.5))
        cbar.set_ticklabels([driver_1, driver_2])
        await ctx.send("Sending Lap Comparison")
        plt.savefig("lapcomparison.png")
        with open('lapcomparison.png', 'rb') as f:
            picture = discord.File(f)
            await ctx.send(file=picture)
        ff1.Cache.clear_cache('cache')
        os.system('rm -rf cache/*')