Access_log if conditional shows URLs when it shouldn't?

My issue: I have an access map that skips logging images based on their filename extensions (e.g. ~*\.gif$) and some directories (e.g. ~*^/sample/). It works for the most part, but I’ve noticed that these requests still get logged when they return a 403 or 404 error. The ngx_http_log_module documentation doesn’t explain the ‘if’ parameter in much detail. My guess is that this has to do with some sort of internal redirect, since they are errors? Is there more documentation on this that I’m missing?

Solutions I’ve tried: I’m not sure what I can really try without understanding the issue deeper. Would I have to add additional map conditional logic based on file type AND response code?

Version of NGINX or NGINX adjacent software (e.g. NGINX Gateway Fabric): 1.28.0

Deployment environment: Ubuntu 24.04

Basic example:

map $uri $skip_image_req {
    ~*\.bmp$ 0;
    ~*\.gif$ 0;
    ~*\.jpg$ 0;
    ~*\.png$ 0;
    ~*^/sample/ 0;
    default 1;
}

access_log logs/access_log combined if=$skip_image_req;

Hey @Jas0n! Could you share your entire NGINX config? I’ll try to recreate this on my end when I get a chance.

Sure, thanks! I have a file called “log_custom.conf” in my conf.d folder. That folder gets included from the main nginx.conf file within the http{} section.

### Inverse logic to NOT log images under any circumstances. ###
map $uri $skip_image_req {
    ~*\.bmp$ 0;
    ~*\.css$ 0;
    ~*\.gif$ 0;
    ~*\.ico$ 0;
    ~*\.jpg$ 0;
    ~*\.js$  0;
    ~*\.json$ 0;
    ~*\.png$ 0;
    ~*\.svg$ 0;
    ~*^/sigs/ 0;
    default 1;
}

# Only if NOT an image...
map $http_user_agent $good_bot_req {
    "~*Googlebot" $skip_image_req;
    "~*APIs-Google" $skip_image_req;
    "~*AdsBot-Google" $skip_image_req;
    "~*Mediapartners-Google" $skip_image_req;
    "~*FeedFetcher-Google" $skip_image_req;
    "~*Google-adstxt" $skip_image_req;
    "~*Google-Read-Aloud" $skip_image_req;
    "~*Google-SearchByImage" $skip_image_req;
    "~*GoogleDocs" $skip_image_req;
    "~*GoogleOther" $skip_image_req;
    "~*Chrome Privacy Preserving Prefetch Proxy" $skip_image_req;
    "~*Yahoo! Slurp" $skip_image_req;
    "~*Yahoo Ad monitoring" $skip_image_req;
    "~*Bingbot" $skip_image_req;
    "~*BingPreview" $skip_image_req;
    "~*AdidxBot" $skip_image_req;
    "~*MSNBot" $skip_image_req;
    "~*contxbot" $skip_image_req;
    "~*Applebot" $skip_image_req;
    "~*DuckDuckBot" $skip_image_req;
    "~*Qwantify" $skip_image_req;
    "~*Yandex" $skip_image_req;
    "~*Baiduspider" $skip_image_req;
    "~*Sogou" $skip_image_req;
    "~*DowntimeDetector" $skip_image_req;
    default 0;
}

# Only if NOT a bot, or image...
map $good_bot_req $normal_req {
    1       0;
    default $skip_image_req;
}

Then within the server{} block of each website I have log lines like these, plus a custom error page (I haven’t tried disabling it yet to see what happens):

access_log /var/log/nginx/site1/access_log combined if=$normal_req;
access_log /var/log/nginx/site1/good_bot_access_log combined if=$good_bot_req;

error_page 403 404 503 /error_page.php;

Everything is being served directly by NGINX (no Apache proxy). I’ve noticed that both static images and PHP pages appear in the logs when they shouldn’t. Here are the relevant lines from the server{} block as well:

location / {
    try_files $uri $uri/ =404;
}

location ~ \.php(?:/.*)?$ {
    include snippets/fastcgi-php8_4.conf;
}

location ~* \.(?:bmp|bz2|cab|css|exe|gif|gz|ico|iso|jpg|js(?:on)?|msi|pdf|png|psd|rar|rm|rtf|svg|tar|xml|zip)$ {
    if ($invalid_referer) { return 403; }
    expires 2w;
    add_header Cache-Control "public, must-revalidate";

    # Use files with .br or .gz instead of on-the-fly compression.
    gzip_static on;
    brotli_static on;

    try_files $uri =404;
}

Did you get a chance to try for yourself?

Not yet!

Do PHP files also only show up when they return an error?

I have a php file in my /sigs/ directory that generates / outputs a signature image. The /sigs/ directory is in that first exclusion map along with the images. When it generates a 403/404 it will show up in the regular log. I just included it for completeness as it’s more than just basic static images that show up in the logs with the 403/404 errors.

Regular php files are not excluded by extension or anything, their requests are logged as normal.

So at a very simple level (based on your first post), things seem to be working as expected – see https://tech-playground.com/snippet/purring-otter-of-drizzle/. (No matter what you try to query it’s going to return a 404 error, but if you query an image you’ll notice it doesn’t get logged.)

Edit: I have gone ahead and added slightly simplified versions of your other map blocks and everything still seems to be working as expected on the playground above. However, when I add try_files, it starts logging all requests.

Alessandro asked me to have a look.

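The default (combined) log format logs $request, i.e. the original request line, which carries the original URI ($request_uri). But your map uses $uri, and the documentation for $uri states that its value can change during request processing, e.g. on internal redirects.

Create a custom log format that includes both $request_uri and $uri and give it a try. Most likely, by the time anything is logged, $uri has already changed.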
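
A minimal sketch of that suggestion, assuming a hypothetical format name (uri_debug) and a hypothetical log path; neither appears in the original config. It logs both variables side by side, with no if= condition, so nothing is filtered out while debugging:

```nginx
# In the http{} context (e.g. in conf.d/log_custom.conf).
# "uri_debug" is a made-up name for illustration.
log_format uri_debug '$remote_addr [$time_local] "$request" $status '
                     'request_uri="$request_uri" uri="$uri"';

# In the server{} block, next to the existing access_log lines.
# Temporary and unconditional; the path is an assumption.
access_log /var/log/nginx/site1/uri_debug_log uri_debug;
```

With the error_page directive from the server{} block above, a request for a missing image would be expected to show the original path in request_uri and /error_page.php in uri, since error_page with a URI performs an internal redirect.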

Okay, I’ll give it a try with request_uri in just a bit and report back!

Interesting that it only occurred with try_files. The fastcgi snippet also has try_files in it, so that tracks with both the images and the PHP file under the path that should be excluded.

Okay, switching to $request_uri looks like it solved the problem, yay! Thank you so very much, oxpa & alessandro!
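
For reference, the change amounts to swapping the map's source variable. This is a sketch based on the basic example from the first post, not the exact config that ended up in production. The (?:\?|$) suffix is an addition that was not discussed in this thread: it assumes image URLs may carry query strings, because $request_uri contains the arguments while $uri does not.

```nginx
# Match against the original, client-sent URI, which internal
# redirects (error_page, try_files fallbacks, index) do not change.
map $request_uri $skip_image_req {
    # Allow either end of string or a query string after the extension.
    ~*\.bmp(?:\?|$) 0;
    ~*\.gif(?:\?|$) 0;
    ~*\.jpg(?:\?|$) 0;
    ~*\.png(?:\?|$) 0;
    ~*^/sample/ 0;
    default 1;
}

access_log logs/access_log combined if=$skip_image_req;
```

Note that $request_uri is also not normalized or decoded the way $uri is, so percent-encoded paths would need their own patterns if they matter.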

I was almost sure there was some internal rewriting going on somewhere, but I was looking in all the wrong places when it was right there in front of my face! I even knew the difference between $uri and $request_uri, but for some reason it just didn’t click. I guess I’ve been working on too many different things at the same time and my brain is scrambled.

whew crisis averted!

Glad to hear you managed to figure it out!
