Skip to content

Conversation

@peripherium
Copy link

This PR aims to fix the bug described in issue #1817.

To start, I want to present a small program that triggers the described issue.
Let's call it issue1817testfile.c:

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv) {
    printf("hit return to modify cmdline\n");
    fgetc(stdin);

    memset(argv[0], 0, strlen(argv[0]));
    strcpy(argv[0], "[xyz 1/2/3]");
    printf("argv changed\n");

    while (1)
        sleep(1);
    return (0);
}

The described issue occurs in the code of LinuxProcessTable.c:

if (argChar == '/' && tokenEnd == (size_t)-1) {
   tokenStart = i + 1;
}

When the option "Show program path" is not enabled, this program is displayed
as "3]" in htop, instead of showing the full text.

Problem Explanation

The issue arises because we search for the / character to split the command
line, which works fine in most cases. However, the command line in
/proc/<pid>/cmdline can have the following formats:

  1. Absolute path: /home/foobar/path/to/executable
  2. Relative path: ./relative/path/to/executable
  3. Filename: relative/path/to/executable

The current implementation assumes that encountering a / is always a valid
separator for the path. This can cause problems when a program modifies
argv[0] , leading to incomplete or truncated commands being
displayed in htop.

Proposed Fix

The most efficient approach is to filter out cases where we deal with absolute
(/) or relative (./) paths. These are the most common command line formats.
We can treat these as paths that should be split by the / character.

For case (3) where no leading ./ is given (e.g., path/to/executable),
we can continue outputting the full string, acknowledging that this may occur
rarely. My argument is:
"It's better to show too much information than too little."

Alternatively, we could read the target from the symlink /proc/<pid>/exe
and compare it to the content of the first string in /proc/<pid>/cmdline:

For example, if /proc/<pid>/exe points to /home/foobar/path/to/exefile,
we would need to locate exefile in the first string of /proc/<pid>/cmdline
and set tokenStart to that index (if exefile is found in the string).

However, this would only solve another special case that practically never
or extremely rarely occurs. Therefore, I have ignored these
type 3 paths ("path/to/executable") for now.

Edge Cases

There is still a small chance that a program path could sneak into the
output due to unusual cases, but this is rare. It’s important to note that even
with this fix, some edge cases might still result in full path being shown.
Therefore, the goal here is to minimize errors, but we can't guarantee 100%
coverage for all possible edge cases.

But I think my patch results in the displayed paths being more useful to the
user in individual cases than the original code.

@Explorer09
Copy link
Contributor

I think your change is added in the wrong place.

Alternatively, we could read the target from the symlink /proc/<pid>/exe
and compare it to the content of the first string in /proc/<pid>/cmdline

Technically htop does this. It was just in a different routine.

@Explorer09
Copy link
Contributor

Also, the pull request description look like it was generated by AI. The use of AI tool needs to be disclosed.

@peripherium
Copy link
Author

I think your change is added in the wrong place.

I've inserted the change where tokenStart was originally changed due to
incorrect assumptions.

Technically htop does this. It was just in a different routine.

I just discovered that this actually happens in the functions
LinuxProcessList_readExe() and Process_makeCommandStr(). Thanks
for pointing that out. I haven't been working with htop code for very
long and was primarily looking for the cause of the described error.

Also, the pull request description look like it was generated by AI. The use of AI tool needs to be disclosed.

"Hey Siri, my htop isn't working properly.
Can you give me step-by-step instructions on how to..."

No, of course not. I confess that parts of the PR were written using Google
Translate, as English isn't my native language (I'm from Germany).
If the wording is a bit clunky in places, that's why.

If your suspicion stems from the fact that the PR looks like it came from an
"overly precise template," let me explain: I examined the code in search of
the bug, following the principle "if I can explain it, then my patch
is appropriate." Especially since my modification is so suspiciously small,
I wanted to explicitly state that I didn't just write something at random that
might produce the desired result.

@Explorer09
Copy link
Contributor

I think your change is added in the wrong place.

I've inserted the change where tokenStart was originally changed due to incorrect assumptions.

Your change there would apply to Linux only, and not other operating systems. And I suspect something is wrong.

Also, the pull request description look like it was generated by AI. The use of AI tool needs to be disclosed.

"Hey Siri, my htop isn't working properly. Can you give me step-by-step instructions on how to..."

No, of course not. I confess that parts of the PR were written using Google Translate, as English isn't my native language (I'm from Germany). If the wording is a bit clunky in places, that's why.

If your suspicion stems from the fact that the PR looks like it came from an "overly precise template," let me explain: I examined the code in search of the bug, following the principle "if I can explain it, then my patch is appropriate." Especially since my modification is so suspiciously small, I wanted to explicitly state that I didn't just write something at random that might produce the desired result.

The reasoning behind the change should be included in the commit description, not just on the PR description.

Also, cut out literal explanations of the code. We all can read the patch content itself.

An AI would often explain what's literally in the code in order to "fill the number of words" because otherwise it would have nothing else to draw information from. For programmers that are competent, such information is useless noise. That's why I can suspect about the AI use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants