A couple of things with the direct light calculations in get_direct_illumination and g


Direct light sampling about q2rtx HOT 11 CLOSED

nvidia commented on August 16, 2024

Direct light sampling

from q2rtx.

Comments (11)

apanteleev commented on August 16, 2024

Thank you for reporting this issue!
I started digging into it and fixed a few other things as well, see the commits mentioned above.
Does anything else look wrong now?


DU-jdto commented on August 16, 2024

On line 180 of indirect_lighting.rgen, is_analytic_light shouldn't have the (spec_bounce_index == 0) condition. Without that condition, the global_ubo.pt_direct_polygon_lights condition should instead check global_ubo.pt_indirect_polygon_lights on the second bounce. Also, the final condition should be global_ubo.pt_xxxx_polygon_lights > 0, not >= 0.

With the asvgf_seed_rng.comp change, frame x checkerboard 1 will have the same rng as frame (x+1) checkerboard 0, which I think will produce duplicate samples when the scene is stationary. To avoid this issue, it should be frame_num * 2 + checkerboard, which is equivalent to the old code.
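The collision described above can be illustrated with a small sketch. It assumes the changed shader effectively combined the two values as `frame_num + checkerboard`; that form is inferred from the comment, not taken from the shader source, and both function names are hypothetical.

```c
#include <stdint.h>

/* Assumed form of the changed seed: neighbouring frames share values,
 * because (x, field 1) and (x + 1, field 0) both give x + 1. */
uint32_t seed_additive(uint32_t frame_num, uint32_t checkerboard)
{
    return frame_num + checkerboard;
}

/* Form suggested in the comment: every (frame, field) pair gets its own
 * value as long as checkerboard is 0 or 1. */
uint32_t seed_interleaved(uint32_t frame_num, uint32_t checkerboard)
{
    return frame_num * 2u + checkerboard;
}
```

With the additive form, frame 7 field 1 and frame 8 field 0 both produce 8, while the interleaved form produces 15 and 16.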


apanteleev commented on August 16, 2024

Makes sense - fixed but with slightly different interpretation of pt_xxxx_polygon_lights.

The RNG thing is not really a problem because although one checkerboard field will use the same sequence as the other on the previous frame, they process different sets of pixels, so overall there is no undersampling (I think). I don't see any image quality difference in real-time rendering mode, and this new behavior is definitely better for the reference accumulation mode because it generates 512 unique sequences over time instead of just 256.


SkacikPL commented on August 16, 2024

I'm not 100% sure here, but I've been implementing the recent changes in my fork and I think I see a definite increase in temporal ghosting in indoor areas:

https://streamable.com/qfymi

Of course, I might've screwed something up on my end, so it's worth checking on yours.


apanteleev commented on August 16, 2024

Yes, with the new changes materials behave slightly differently, and you see more specular reflections. And the specular denoiser is not great here - just a temporal filter. The ghosting was not so visible before because the signal was just too dim compared to diffuse lighting. I'll try to come up with a solution for the ghosting / noise, and some materials need roughness tweaking too. You should probably not merge these changes until it's resolved.
(you can verify it's the specular channel by setting flt_scale_spec to 0)


SkacikPL commented on August 16, 2024

(you can verify it's the specular channel by setting flt_scale_spec to 0)

Yeah that seems to be the case.


DU-jdto commented on August 16, 2024

Makes sense - fixed but with slightly different interpretation of pt_xxxx_polygon_lights.

The RNG thing is not really a problem because although one checkerboard field will use the same sequence as the other on the previous frame, they process different sets of pixels, so overall there is no undersampling (I think).

Are you sure? The two checkerboard fields alternate between odd and even pixels on a frame by frame basis, so I would think there would be overlap. For instance, if my understanding is correct frame 0 checkerboard 0 position (0, 0) would correspond to pixel (0, 0) in the full frame while frame 0 checkerboard 1 position (0, 0) would correspond to pixel (1, 0), but frame 1 checkerboard 0 position (0, 0) would correspond to pixel (1, 0) and frame 1 checkerboard 1 position (0, 0) would correspond to pixel (0, 0).
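The alternation described in this comment can be written down for row 0 as a small sketch. The formula is an assumption chosen only to reproduce the four cases listed above; the actual shader may also offset by row, and `field_to_pixel_x` is a hypothetical name.

```c
/* Maps a field-local x position to a full-frame x coordinate (row 0 only).
 * Assumption: the field that owns the even pixels swaps every frame. */
int field_to_pixel_x(int pos_x, int field, int frame_num)
{
    return pos_x * 2 + ((field + frame_num) & 1);
}
```

Under this mapping, full-frame pixel (1, 0) is rendered by field 1 on frame 0 and by field 0 on frame 1, which is the overlap the comment is concerned about.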

EDIT: In my local fork, I fixed the issue of getting unique rng by keeping lines 55 and 56 unchanged but changing lines 53 and 54 to:

	rng_seed |= (uint(ipos.x + frame_num / (NUM_BLUE_NOISE_TEX / 2)) % BLUE_NOISE_RES) <<  0u;
	rng_seed |= (uint(ipos.y + frame_num / (NUM_BLUE_NOISE_TEX / 2)) % BLUE_NOISE_RES) << 10u;

For a while now I've actually had the reference mode accumulating 8192 samples, and it seems to work fine.
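A C transcription of the two modified lines may make the effect easier to see. This is only a sketch: the values of `BLUE_NOISE_RES` and `NUM_BLUE_NOISE_TEX` are assumptions, the unchanged lines 55 and 56 are not shown in the thread and are left out, and `seed_low_bits` is a hypothetical name.

```c
#include <stdint.h>

#define BLUE_NOISE_RES     256u  /* assumed value */
#define NUM_BLUE_NOISE_TEX 128u  /* assumed value */

/* Packs the shifted x/y lookup coordinates into the low bits of the seed,
 * mirroring the two GLSL lines quoted above. */
uint32_t seed_low_bits(uint32_t x, uint32_t y, uint32_t frame_num)
{
    /* The lookup position moves by one texel every NUM_BLUE_NOISE_TEX / 2 frames. */
    uint32_t shift = frame_num / (NUM_BLUE_NOISE_TEX / 2u);
    uint32_t seed = 0u;

    seed |= ((x + shift) % BLUE_NOISE_RES) << 0u;
    seed |= ((y + shift) % BLUE_NOISE_RES) << 10u;
    return seed;
}
```

With these assumed constants the coordinates stay put for frames 0 to 63 and move by one texel at frame 64. Both coordinates move by the same amount, so the lookup position travels along a diagonal of the noise texture.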


apanteleev commented on August 16, 2024

Are you sure? The two checkerboard fields alternate between odd and even pixels...

Yes they alternate, but the RNG seed was computed without the alternation...
In any case, the alternation is now removed from the real-time rendering mode because the result looks better overall, less noisy and sharper - see 2700c59.

For accumulation rendering, I implemented a version of your coordinate adjustment but with a change to remove the obvious diagonal noise patterns that appeared after a couple thousand frames.

Thanks again!


apanteleev commented on August 16, 2024

@SkacikPL regarding the specular noise/ghosting - see commit 96d70a6: it's still a half measure, but that particular area looks much better now.


SkacikPL commented on August 16, 2024

Yeah, I just checked it out and it is better. Not exactly "perfectly playable", but also not entirely unplayable like it used to be.

I'm also glad some of my ideas managed to get on board too.

On a semi-related note, I was experimenting with automating accumulation rendering for demos to get prerendering functionality.
https://youtu.be/1ZIgTOwng3U

I'm not sure how useful it would be to the general user base, but the concept isn't hard to implement.
I basically added a cl_renderdemo cvar, which determines whether a demo should be rendered during playback, and a cl_renderdemo_fps cvar, which sets the timestep between frames. Then, at the bottom of CL_UpdateFrameTimes, I added a separate sync type:

	if (cl_renderdemo->integer && cls.demo.playback)
	{
		main_msec = fps_to_msec(cl_renderdemo_fps->integer);
		sync_mode = SYNC_FULL;
	}

Then, in CL_Frame(), directly under the sync switches, I added:

	if (cls.demo.playback && cl_renderdemo->integer && cl_paused->integer != 2)
		main_extra = main_msec;

And set client time to tick only when unpaused:

    if (!sv_paused->integer && !(cls.demo.playback && cl_renderdemo->integer && cl_paused->integer == 2)) {
        cl.time += main_extra;

Lastly, in if(phys_frame), I added:

	if (cls.demo.playback && cl_renderdemo->integer && cl_paused->integer != 2)
	{
		Cvar_Set("cl_paused", "2");
		CL_CheckForPause();
	}

This ensures a fixed time step when a demo is played with cl_renderdemo set to 1, with playback paused on each frame.
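The advance, pause, accumulate, dump cycle that these snippets set up can be modelled in isolation. The following is a minimal sketch, not engine code: `render_state_t`, `render_tick`, and `frame_msec` are hypothetical names, and the integer division in `frame_msec` is an assumption about how the frame time is derived from the target fps.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the engine's fps-to-milliseconds helper. */
int frame_msec(int fps)
{
    return 1000 / fps;
}

typedef struct {
    int  time_msec;      /* demo clock */
    bool paused;         /* stands in for cl_paused == 2 */
    int  accumulated;    /* frames accumulated for the current image */
    int  frames_dumped;  /* images written so far */
} render_state_t;

/* One iteration of the client loop in render-demo mode. */
void render_tick(render_state_t *s, int fps, int frames_to_accumulate)
{
    if (!s->paused) {
        s->time_msec += frame_msec(fps);  /* advance exactly one fixed step */
        s->paused = true;                 /* then freeze the demo */
        s->accumulated = 0;
        return;
    }
    if (++s->accumulated >= frames_to_accumulate) {
        s->frames_dumped++;               /* image is complete and written */
        s->paused = false;                /* resume for the next step */
    }
}
```

In this model every output image costs one advancing tick plus `frames_to_accumulate` paused ticks, and the demo clock only moves on the advancing tick.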

In vkpt\main.c I added:

void stbi_writex(void *context, void *data, int size)
{
	FS_Write(data, size, (qhandle_t)(size_t)context);
}

#define IMG_SAVE(x) \
    static qerror_t IMG_Save##x(qhandle_t f, const char *filename, \
        byte *pic, int width, int height, int row_stride, int param)

IMG_SAVE(PNG)
{
	stbi_flip_vertically_on_write(1);
	int ret = stbi_write_png_to_func(stbi_writex, (void*)(size_t)f, width, height, 3, pic, row_stride);

	if (ret)
		return Q_ERR_SUCCESS;

	return Q_ERR_LIBRARY_ERROR;
}

static qhandle_t create_framedump(char *buffer, size_t size,
	const char *name, const char *ext)
{
	qhandle_t f;
	qerror_t ret;
	int i;

	if (name && *name) {
		// save to user supplied name
		return FS_EasyOpenFile(buffer, size, FS_MODE_WRITE,
			"screenshots/", name, ext);
	}

	// find a file name to save it to
	for (i = 0; i < 1000000; i++) {
		Q_snprintf(buffer, size, "screenshots/%s_%03d%s", cls.demo.file_name, i, ext);
		ret = FS_FOpenFile(buffer, &f, FS_MODE_WRITE | FS_FLAG_EXCL);
		if (f) {
			return f;
		}
		if (ret != Q_ERR_EXIST) {
			Com_EPrintf("Couldn't exclusively open %s for writing: %s\n",
				buffer, Q_ErrorString(ret));
			return 0;
		}
	}

	Com_EPrintf("Ran out of frame indexes!\n");
	return 0;
}

static qboolean make_framedump(const char *name, const char *ext,
	qerror_t(*save)(qhandle_t, const char *, byte *, int, int, int, int),
	int param)
{
	char        buffer[MAX_OSPATH];
	byte        *pixels;
	qerror_t    ret;
	qhandle_t   f;
	int         w, h, rowbytes;

	f = create_framedump(buffer, sizeof(buffer), name, ext);
	if (!f) {
		return qfalse;
	}

	pixels = IMG_ReadPixels(&w, &h, &rowbytes);
	ret = save(f, buffer, pixels, w, h, rowbytes, param);
	FS_FreeTempMem(pixels);

	FS_FCloseFile(f);

	if (ret < 0) {
		Com_EPrintf("Couldn't write %s: %s\n", buffer, Q_ErrorString(ret));
		return qfalse;
	}
	else {
		return qtrue;
	}
}

And my entire evaluate_reference_mode looks like this:

static void
evaluate_reference_mode(reference_mode_t* ref_mode)
{
	if (cl_paused->integer == 2 && cvar_pt_accumulation_rendering->integer > 0)
	{
		num_accumulated_frames++;

		const int num_warmup_frames = 5;
		const int num_frames_to_accumulate = cvar_pt_accumulation_rendering_framenum->integer;

		ref_mode->enable_accumulation = qtrue;
		ref_mode->enable_denoiser = qfalse;
		ref_mode->num_bounce_rays = 2;
		ref_mode->temporal_blend_factor = 1.0f / min(max(1, num_accumulated_frames - num_warmup_frames), num_frames_to_accumulate);

		switch (cvar_pt_accumulation_rendering->integer)
		{
		case 1: {
			float percentage = powf(max(0.f, (num_accumulated_frames - num_warmup_frames) / (float)num_frames_to_accumulate), 0.5f);
			if (percentage < 1.0f)
			{
				if (!cls.demo.playback)
				{
					char text[MAX_QPATH];
					Q_snprintf(text, sizeof(text), "Reference path tracing mode: accumulating samples... %d%%(%i)", (int)(min(1.f, percentage) * 100.f), num_accumulated_frames);

					int x = r_config.width / 4;
					int y = r_config.height / 4 - 50;
					R_SetScale(0.5f);
					R_SetColor(0xff000000u);
					SCR_DrawStringEx(x + 1, y + 1, UI_CENTER, MAX_QPATH, text, SCR_GetFont());
					R_SetColor(~0u);
					SCR_DrawStringEx(x, y, UI_CENTER, MAX_QPATH, text, SCR_GetFont());
					R_SetAlphaScale(1.f);
				}
			}
			else
			{
				SCR_SetHudAlpha(0.f);

				if (cl_renderdemo->integer)
				{
					qboolean result = make_framedump("", ".png", IMG_SavePNG, 0);

					if (result)
					{
						Cvar_Set("cl_paused", "0");
						CL_CheckForPause();

						num_accumulated_frames = 0;

						ref_mode->enable_accumulation = qfalse;
						ref_mode->enable_denoiser = !!cvar_flt_enable->integer;
						if (cvar_pt_num_bounce_rays->value == 0.5f)
							ref_mode->num_bounce_rays = 0.5f;
						else
							ref_mode->num_bounce_rays = max(0, min(2, round(cvar_pt_num_bounce_rays->value)));
						ref_mode->temporal_blend_factor = 0.f;
					}
					else
						CL_Disconnect(ERR_DISCONNECT);
				}
				break;
			}
		}
		// No break while still accumulating: falls through to case 2, which hides the HUD.
		case 2:
			SCR_SetHudAlpha(0.f);
			break;
		}
	}
	else
	{
		num_accumulated_frames = 0;

		ref_mode->enable_accumulation = qfalse;
		ref_mode->enable_denoiser = !!cvar_flt_enable->integer;
		if (cvar_pt_num_bounce_rays->value == 0.5f)
			ref_mode->num_bounce_rays = 0.5f;
		else
			ref_mode->num_bounce_rays = max(0, min(2, round(cvar_pt_num_bounce_rays->value)));
		ref_mode->temporal_blend_factor = 0.f;
	}
}

About half of that is dirty hacks that can probably be done much better, but it gets the job done.
I managed to render a sample of ~9 seconds of 3840x2160 video at 60 fps in about 5 hours on my 2070, where each frame accumulated 300 frames' worth of data.
Audio also has to be captured separately during a normal demo run, while maintaining the target framerate, but it's not too much hassle.

Aside from that, it's pretty straightforward: just record any demo and play it back with cl_renderdemo set to 1, and it will dump frames as demoname_XXX in the screenshots folder.
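The temporal_blend_factor expression in evaluate_reference_mode above amounts to a running average once the warm-up frames have passed. The sketch below assumes the accumulation pass blends linearly between the history and the new sample; that blend is an assumption about the renderer, and `blend_factor` and `accumulate` are hypothetical names.

```c
/* Blend factor as computed in evaluate_reference_mode:
 * 1 / min(max(1, accumulated - warmup), frames_to_accumulate). */
float blend_factor(int num_accumulated, int num_warmup, int num_to_accumulate)
{
    int k = num_accumulated - num_warmup;
    if (k < 1) {
        k = 1;                  /* during warm-up each new frame replaces the history */
    }
    if (k > num_to_accumulate) {
        k = num_to_accumulate;  /* past the target the factor stops shrinking */
    }
    return 1.0f / (float)k;
}

/* Assumed accumulation step: linear blend from history towards the new sample. */
float accumulate(float history, float sample, float factor)
{
    return history + (sample - history) * factor;
}
```

With 5 warm-up frames and a target of 4, feeding five throwaway samples followed by 1, 2, 3, 4 leaves the history at 2.5, the mean of the four samples after warm-up.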


DU-jdto commented on August 16, 2024

Yes they alternate, but the RNG seed was computed without the alternation...

That's true, but it only matters for resolutions for which (resX / 2) % 256 != 0, due to the % BLUE_NOISE_RES. Consider again the case of position (0, 0) checkerboard 0 vs position (0, 0) checkerboard 1. In the rng seed texture, the former maps to position (0, 0) while the latter maps to position (resX / 2, 0). Say the resolution is 2560x1440. Then (ipos.x % BLUE_NOISE_RES) will be 0 for the former case, but also 0 for the latter case (2560/2 % 256 == 0).
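The resolution dependence can be checked with a few lines of arithmetic. This sketch assumes, as the comment describes, that field 1 is stored `res_x / 2` texels to the right of field 0 in the seed texture and that `BLUE_NOISE_RES` is 256; the function names are hypothetical.

```c
#define BLUE_NOISE_RES 256  /* assumed value, matching the "% 256" in the comment */

/* x coordinate in the seed texture for a field-local position. */
int seed_texture_x(int pos_x, int field, int res_x)
{
    return pos_x + field * (res_x / 2);
}

/* Returns 1 when both fields hit the same blue-noise column at position 0. */
int fields_share_noise_column(int res_x)
{
    return (seed_texture_x(0, 0, res_x) % BLUE_NOISE_RES)
        == (seed_texture_x(0, 1, res_x) % BLUE_NOISE_RES);
}
```

For a 2560-wide frame the half width is 1280, a multiple of 256, so both fields land on the same column. For a 1920-wide frame the half width is 960, which leaves a remainder of 192, so they differ.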

