|
- \input texinfo @c -*-texinfo-*-
- @c %**start of header
- @setfilename find-maint.info
- @include versionmaint.texi
- @settitle Maintaining GNU Findutils @value{VERSION}
- @c For double-sided printing, uncomment:
- @c @setchapternewpage odd
- @c %**end of header
- @iftex
- @finalout
- @end iftex
- @dircategory GNU organization
- @direntry
- * Maintaining Findutils: (find-maint). Maintaining GNU findutils
- @end direntry
- @copying
- This manual explains how GNU findutils is maintained, how changes should
- be made and tested, and what resources exist to help developers.
- This document corresponds to version @value{VERSION} of the GNU findutils.
- Copyright @copyright{} 2007--2021 Free Software Foundation, Inc.
- @quotation
- Permission is granted to copy, distribute and/or modify this document
- under the terms of the GNU Free Documentation License, Version 1.3 or
- any later version published by the Free Software Foundation; with no
- Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
- A copy of the license is included in the section entitled
- ``GNU Free Documentation License''.
- @end quotation
- @end copying
- @titlepage
- @title Maintaining GNU Findutils
- @subtitle version @value{VERSION}, @value{UPDATED}
- @author by James Youngman
- @page
- @vskip 0pt plus 1filll
- @insertcopying
- @end titlepage
- @contents
- @ifnottex
- @node Top, Introduction, (dir), (dir)
- @top Maintaining GNU Findutils
- @insertcopying
- @end ifnottex
- @menu
- * Introduction::
- * Maintaining GNU Programs::
- * Design Issues::
- * Coding Conventions::
- * Tools::
- * Using the GNU Portability Library::
- * Documentation::
- * Testing::
- * Bugs::
- * Distributions::
- * Internationalisation::
- * Security::
- * Making Releases::
- * GNU Free Documentation License::
- @end menu
- @node Introduction
- @chapter Introduction
- This document explains how to contribute to and maintain GNU
- Findutils. It concentrates on developer-specific issues. For
- information about how to use the software please refer to
- @ref{Introduction, ,Introduction,find,The Findutils manual}.
- This manual aims to be useful without necessarily being verbose. It's
- also a recent document, so there will be many areas in which
- improvements can be made. If you find that the document omits
- important information, or that any part of it is so terse as to
- be unhelpful, please ask for help on the @email{bug-findutils@@gnu.org}
- mailing list. We'll try to improve this document too.
- @node Maintaining GNU Programs
- @chapter Maintaining GNU Programs
- GNU Findutils is part of the GNU Project and so there are a number of
- documents which set out standards for the maintenance of GNU
- software.
- @table @file
- @item standards.texi
- GNU Project Coding Standards. All changes to findutils should comply
- with these standards. In some areas we go somewhat beyond the
- requirements of the standards, but these cases are explained in this
- manual.
- @item maintain.texi
- Information for Maintainers of GNU Software. This document provides
- guidance for GNU maintainers. Everybody with commit access should
- read this document. Everybody else is welcome to do so too, of
- course.
- @end table
- @node Design Issues
- @chapter Design Issues
- The findutils package is installed on many many systems, usually as a
- fundamental component. The programs in the package are often used in
- order to successfully boot or fix the system.
- This fact means that for findutils we bear in mind considerations that
- may not apply so much to other packages. For example, the fact
- that findutils is often a base component motivates us to
- @itemize
- @item Limit dependencies on libraries
- @item Avoid dependencies on other large packages (for example, interpreters)
- @item Be conservative when making changes to the 'stable' release branch
- @end itemize
- All those considerations come before functionality. Functional
- enhancements are still made to findutils, but these are almost
- exclusively introduced in the 'development' release branch, to allow
- extensive testing and proving.
- Sometimes it is useful to have a priority list to provide guidance
- when making design trade-offs. For findutils, that priority list is:
- @enumerate
- @item Correctness
- @item Standards compliance
- @item Security
- @item Backward compatibility
- @item Performance
- @item Functionality
- @end enumerate
- For example, we support the @code{-exec} action because POSIX
- compliance requires this, even though there are security problems with
- it and we would otherwise prefer people to use @code{-execdir}. There
- are also cases where some performance is sacrificed in the name of
- security. For example, the sanity checks that @code{find} performs
- while traversing a directory tree may slow it down. We adopt
- functional changes, and functional changes are allowed to make
- @code{find} slower, but only if there is no detectable impact on users
- who don't use the feature.
- Backward-incompatible changes do get made in order to comply with
- standards (for example the behaviour of @code{-perm -...} changed in
- order to comply with POSIX). However, they don't get made in order to
- provide better ease of use; for example the semantics of @code{-size
- -2G} are almost always unexpected by users, but we retain the current
- behaviour because of backward compatibility and for its similarity to
- the block-rounding behaviour of @code{-size -30}. We might introduce
- a change which does not have the unfortunate rounding behaviour, but
- we would choose another syntax (for example @code{-size '<2G'}) for
- this.
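- The rounding can be illustrated with a short sketch. This is not the
- code from @file{find/pred.c}; the helper name
- @code{size_matches_less_than} is invented here, and the sketch assumes
- only that sizes are rounded up to whole units before the comparison.
- @example
- #include <stdint.h>
- #include <stdio.h>
- /* Hypothetical helper: would "-size -N<unit>" match SIZE bytes? */
- static int
- size_matches_less_than (uintmax_t size, uintmax_t unit, uintmax_t n)
- @{
-   /* Round up to whole units, then compare with N. */
-   uintmax_t units = size / unit + (size % unit != 0);
-   return units < n;
- @}
- int
- main (void)
- @{
-   const uintmax_t G = (uintmax_t) 1 << 30;
-   /* Exactly 1GiB is one unit, so -size -2G matches. */
-   printf ("%d\n", size_matches_less_than (G, G, 2));
-   /* One byte more rounds up to two units: no match. */
-   printf ("%d\n", size_matches_less_than (G + 1, G, 2));
-   return 0;
- @}
- @end example
- Under that assumption, @code{-size -2G} matches only files of at most
- 1GiB, which is rarely what the user intended.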
- In a general sense, we try to do test-driven development of the
- findutils code; that is, we try to implement test cases for new
- features and bug fixes before modifying the code to make the test
- pass. Some features of the code are tested well, but the test
- coverage for other features is less good. If you are about to modify
- the code for a predicate and aren't sure about the test coverage, use
- @code{grep} on the test directories and measure the coverage with
- @code{lcov} or another test coverage tool.
- You should be able to use the @code{coverage} Makefile target (it's
- defined in @file{maint.mk}) to generate a test coverage report for
- findutils. Due to limitations in @code{lcov}, this only works if
- your build directory is the same as the source directory (that is,
- you're not using a VPATH build configuration).
- Lastly, we try not to depend on having a ``working system''. The
- findutils suite is used for diagnosis of problems, and this applies
- especially to @code{find}. We should ensure that @code{find} still
- works on relatively broken systems, for example systems with damaged
- @file{/etc/passwd} or @file{/etc/fstab} files. Another interesting
- example is the case where a system is a client of one or more
- unresponsive NFS servers. On such a system, if you try to stat all
- mount points, your program will hang indefinitely, waiting for the
- remote NFS server to respond.
- Another interesting but unusual case is broken NFS servers and corrupt
- filesystems; sometimes they return `impossible' file modes. It's
- important that find does not entirely fail when encountering such a
- file.
- @node Coding Conventions
- @chapter Coding Conventions
- Coding style documents which set out to establish a uniform look and
- feel to source code have worthy goals, for example greater ease of
- maintenance and readability. However, I do not believe that in
- general coding style guide authors can envisage every situation, and
- it is always possible that it might on occasion be necessary to break
- the letter of the style guide in order to honour its spirit, or to
- better achieve the style guide's goals.
- I've certainly seen many style guides outside the free software world
- which make bald statements such as ``functions shall have exactly one
- return statement''. The desire to ensure consistency and obviousness
- of control flow is laudable, but it is all too common for such bald
- requirements to be followed unthinkingly. Certainly I've seen such
- coding standards result in unmaintainable code with terrible
- infelicities such as functions containing @code{if} statements nested
- nine levels deep. I suppose such coding standards don't survive in
- free software projects because they tend to drive away potential
- contributors or tend to generate heated discussions on mailing lists.
- Equally, a nine-level-deep function in a free software program would
- quickly get refactored, assuming it is obvious what the function is
- supposed to do...
- Be that as it may, the approach I will take for this document is to
- explain some idioms and practices in use in the findutils source code,
- and leave it up to the reader's engineering judgement to decide which
- considerations apply to the code they are working on, and whether or
- not there is sufficient reason to ignore the guidance in current
- circumstances.
- @menu
- * Make the Compiler Find the Bugs::
- * Factor Out Repeated Code::
- * Debugging is For Users Too::
- * Don't Trust the File System Contents::
- * The File System Is Being Modified::
- @end menu
- @node Make the Compiler Find the Bugs
- @section Make the Compiler Find the Bugs
- Finding bugs is tedious. If I have a filesystem containing two
- million files, and a find command line should print one million of
- them, but in fact it misses out 1%, you can tell the program is
- printing the wrong result only if you know the right answer for that
- filesystem at that time. If you don't know this, you may just not
- find out about that bug. For this reason it is important to have a
- comprehensive test suite.
- The test suite is of course not the only way to find the bugs. The
- findutils source code makes liberal use of the @code{assert} macro. While on
- the one hand these might be a performance drain, the performance
- impact of most of these is negligible compared to the time taken to
- fetch even one sector from a disk drive.
- Assertions should not be used to check the results of operations which
- may be affected by the program's external environment. For example,
- never assert that a file could be opened successfully. Errors
- relating to problems with the program's execution environment should
- be diagnosed with a user-oriented error message. An assertion failure
- should always denote a bug in the program.
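- As a minimal sketch of this distinction, consider the following
- function. The name @code{open_input} is hypothetical, and plain
- @code{fprintf} stands in for whatever error-reporting routine the
- surrounding code really uses.
- @example
- #include <assert.h>
- #include <errno.h>
- #include <stdio.h>
- #include <string.h>
- /* Hypothetical helper, for illustration only. */
- static FILE *
- open_input (const char *name)
- @{
-   FILE *fp;
-   /* A NULL name can only result from a bug in the caller. */
-   assert (name != NULL);
-   fp = fopen (name, "r");
-   if (fp == NULL)
-     @{
-       /* The environment let us down: tell the user, don't abort. */
-       fprintf (stderr, "cannot open %s: %s\n", name, strerror (errno));
-     @}
-   return fp;
- @}
- @end example
- The assertion guards a property the program itself is responsible for,
- while the failure to open the file is reported as an ordinary error.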
- Avoid using @code{assert} to mark not-fully-implemented features of
- your code as such. Finish the implementation, disable the code, or
- leave the unfinished version on a local branch.
- Several programs in the findutils suite perform self-checks. See for
- example the function @code{pred_sanity_check} in @file{find/pred.c}.
- This is generally desirable.
- There are also a number of small ways in which we can help the
- compiler to find the bugs for us.
- @subsection Constants in Equality Testing
- It's a common error to write @code{=} when @code{==} is meant.
- Sometimes this happens in new code and is simply due to finger
- trouble. Sometimes it is the result of the inadvertent deletion of a
- character. In any case, there is a subset of cases where we can
- persuade the compiler to generate an error message when we make this
- mistake; this is where the equality test is with a constant.
- This is an example of a vulnerable piece of code.
- @example
- if (x == 2)
- ...
- @end example
- A simple typo converts the above into
- @example
- if (x = 2)
- ...
- @end example
- We've introduced a bug; the condition is always true, and the value of
- @code{x} has been changed. However, a simple change to our practice
- would have made us immune to this problem:
- @example
- if (2 == x)
- ...
- @end example
- Usually, the Emacs keystroke @kbd{M-t} can be used to swap the operands.
- @subsection Spelling of ASCII NUL
- Strings in C are just sequences of characters terminated by a NUL.
- The ASCII NUL character has the numerical value zero. It is normally
- represented in C code as @samp{\0}. Here is a typical piece of C
- code:
- @example
- *p = '\0';
- @end example
- Consider what happens if there is an unfortunate typo:
- @example
- *p = '0';
- @end example
- We have changed the meaning of our program and the compiler cannot
- diagnose this as an error. Our string is no longer terminated. Bad
- things will probably happen. It would be better if the compiler could
- help us diagnose this problem.
- In C, the type of @code{'\0'} is in fact int, not char. This provides
- us with a simple way to avoid this error. The constant @code{0} has
- the same value and type as the constant @code{'\0'}. However, it is
- not as vulnerable to typos. For this reason I normally prefer to
- use this code:
- @example
- *p = 0;
- @end example
- @node Factor Out Repeated Code
- @section Factor Out Repeated Code
- Repeated code imposes a greater maintenance burden and increases the
- exposure to bugs. For example, if you discover that something you
- want to implement has some similarity with an existing piece of code,
- don't cut and paste it. Instead, factor the code out. The risk of
- cutting and pasting the code, particularly if you do this several
- times, is that you end up with several copies of the same code.
- If the original code had a bug, you now have N places where this needs
- to be fixed. It's all too easy to miss some out when trying to fix the
- bug. Equally, it's quite possible that when pasting the code into
- some function, the pasted code was not quite adapted correctly to its
- new environment. To pick a contrived example, perhaps it modifies a
- global variable which it (that [original] code) shouldn't be touching
- in its new home. Worse, perhaps it makes some unstated assumption about
- the nature of the input arguments which is in fact not true for the
- context of the now duplicated code.
- A good example of the use of refactoring in findutils is the
- @code{collect_arg} function in @file{find/parser.c}. A less clear-cut
- but larger example is the factoring out of code which would otherwise
- have been duplicated between @file{find/oldfind.c} and
- @file{find/ftsfind.c}.
- The findutils test suite is comprehensive enough that refactoring code
- should not generally be a daunting prospect from a testing point of
- view. Nevertheless there are some areas which are only
- lightly-tested:
- @enumerate
- @item Tests on the ages of files
- @item Code which deals with the values returned by operating system calls (for example handling of ENOENT)
- @item Code dealing with OS limits (for example, limits on path length
- or exec arguments)
- @item Code relating to features not all systems have (for example
- Solaris Doors)
- @end enumerate
- Please exercise caution when working in those areas.
- @node Debugging is For Users Too
- @section Debugging is For Users Too
- Debug and diagnostic code is often used to verify that a program is
- working in the way its author thinks it should be. But users are
- often uncertain about what a program is doing, too. Exposing a
- little more diagnostic information to them can help. Much of the diagnostic
- code in @code{find}, for example, is controlled by the @samp{-D} flag,
- as opposed to C preprocessor directives.
- Making diagnostic messages available to users also means that the
- phrasing of the diagnostic messages becomes important, too.
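- The difference between a run-time switch and a preprocessor directive
- might be sketched as follows. The flag names and the variable
- @code{debug_flags} are invented for this example and are not
- necessarily those used in @code{find}.
- @example
- #include <stdio.h>
- /* Hypothetical flag bits, one per diagnostic category. */
- enum @{ DEBUG_STAT = 1, DEBUG_TREE = 2 @};
- /* Assumed to be filled in while the -D option is parsed. */
- static unsigned long debug_flags;
- static void
- note_stat (const char *path)
- @{
-   /* Tested at run time, so every installed binary can do this. */
-   if (debug_flags & DEBUG_STAT)
-     fprintf (stderr, "debug: stat(%s)\n", path);
- @}
- @end example
- Had the message been wrapped in @code{#ifdef DEBUG} instead, users
- would need to rebuild the program in order to see it.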
- @node Don't Trust the File System Contents
- @section Don't Trust the File System Contents
- People use @code{find} to search in directories created by other
- people. Sometimes they do this to check for suspicious activity (for
- example to look for new setuid binaries). This means that it would be
- bad if @code{find} were vulnerable to, say, a security problem
- exploitable by constructing a specially-crafted filename. The same
- consideration would apply to @code{locate} and @code{updatedb}.
- Henry Spencer said this well in his fifth commandment:
- @quotation
- Thou shalt check the array bounds of all strings (indeed, all arrays),
- for surely where thou typest @samp{foo} someone someday shall type
- @samp{supercalifragilisticexpialidocious}.
- @end quotation
- Symbolic links can often be a problem. If @code{find} calls
- @code{lstat} on something and discovers that it is a directory, it's
- normal for @code{find} to recurse into it. Even if the @code{chdir}
- system call is used immediately, there is still a window of
- opportunity between the @code{lstat} and the @code{chdir} in which a
- malicious person could rename the directory and substitute a symbolic
- link to some other directory.
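- One possible defence, sketched below under stated assumptions, is to
- check after the @code{chdir} that the current directory is the one that
- @code{lstat} described. The function name @code{chdir_checked} is
- hypothetical and this is not the code used in findutils.
- @example
- #include <sys/stat.h>
- #include <unistd.h>
- /* Hypothetical sketch: enter DIR only if it is still the directory
-    that lstat reported.  Returns 0 on success, -1 otherwise. */
- static int
- chdir_checked (const char *dir)
- @{
-   struct stat before, after;
-   if (lstat (dir, &before) != 0 || !S_ISDIR (before.st_mode))
-     return -1;
-   if (chdir (dir) != 0)
-     return -1;
-   /* Examine where we actually ended up. */
-   if (stat (".", &after) != 0
-       || before.st_dev != after.st_dev
-       || before.st_ino != after.st_ino)
-     @{
-       /* The directory was replaced between the two calls. */
-       return -1;
-     @}
-   return 0;
- @}
- @end example
- When the final check fails the process is already in the wrong
- directory, so a real implementation must also be able to return to a
- known location before continuing.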
- @node The File System Is Being Modified
- @section The File System Is Being Modified
- The filesystem gets modified while you are traversing it. For
- example, it's normal for files to get deleted while @code{find} is
- traversing a directory. Issuing an error message seems helpful when a
- file is deleted from the one directory you are interested in, but if
- @code{find} is searching 15000 directories, such a message becomes
- less helpful.
- Bear in mind also that it is possible for the directory @code{find} is
- searching to be concurrently moved elsewhere in the file system,
- and that the directory in which @code{find} was started could be
- deleted.
- Henry Spencer's sixth commandment is also apposite here:
- @quotation
- If a function be advertised to return an error code in the event of
- difficulties, thou shalt check for that code, yea, even though the
- checks triple the size of thy code and produce aches in thy typing
- fingers, for if thou thinkest ``it cannot happen to me'', the gods
- shall surely punish thee for thy arrogance.
- @end quotation
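- Applied to a tree that is changing underfoot, this might look like the
- following sketch. The function @code{examine} and its return
- convention are hypothetical; whether a vanished file deserves a message
- is a policy decision that this example simply assumes.
- @example
- #include <errno.h>
- #include <stdio.h>
- #include <string.h>
- #include <sys/stat.h>
- /* Hypothetical sketch: returns 1 if PATH was examined, 0 if it
-    vanished, and -1 on any other error. */
- static int
- examine (const char *path, struct stat *st)
- @{
-   if (lstat (path, st) == 0)
-     return 1;
-   if (errno == ENOENT)
-     return 0;   /* deleted during the traversal; stay quiet */
-   fprintf (stderr, "%s: %s\n", path, strerror (errno));
-   return -1;
- @}
- @end example
- Every outcome of the call is handled, and the caller can tell an
- expected race apart from a genuine failure.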
- There are a lot of files out there. They come in all dates and
- sizes. There is a condition out there in the real world to exercise
- every bit of the code base. So we try to test that code base before
- someone falls over a bug.
- @node Tools
- @chapter Tools
- Most of the tools required to build findutils are mentioned in the
- file @file{README-hacking}. We also use some other tools:
- @table @asis
- @item System call traces
- Much of the execution time of find is spent waiting for filesystem
- operations. A system call trace (for example, that provided by
- @code{strace}) shows what system calls are being made. Using this
- information we can work to remove unnecessary file system operations.
- @item Valgrind
- Valgrind is a tool which dynamically verifies the memory accesses a
- program makes to ensure that they are valid (for example, that the
- behaviour of the program does not in any way depend on the contents of
- uninitialized memory).
- @item DejaGnu
- DejaGnu is the test framework used to run the findutils test suite
- (the @code{runtest} program is part of DejaGnu). It would be ideal if
- everybody building @code{findutils} also ran the test suite, but many
- people don't have DejaGnu installed. When changes are made to
- findutils, DejaGnu is invoked a lot. @xref{Testing}, for more
- information.
- @end table
- @node Using the GNU Portability Library
- @chapter Using the GNU Portability Library
- The Gnulib library (@url{https://www.gnu.org/software/gnulib/}) makes a
- variety of systems look more like a GNU/Linux system and also applies
- a bunch of automatic bug fixes and workarounds. Some of these
- apply to GNU/Linux systems too. For example, the Gnulib regex
- implementation is used when we determine that we are building on a
- GNU libc system with a bug in the regex implementation.
- @section How and Why we Import the Gnulib Code
- Gnulib does not have a release process which results in a source
- tarball you can download. Instead, the code is simply made available
- via Git, so we import gnulib via the submodule feature. The bootstrap
- script performs the necessary steps.
- Findutils does not use all the Gnulib code. The modules we need are
- listed in the file @file{bootstrap.conf}.
- The upshot of all this is that we can use the findutils git repository
- to track which version of Gnulib every findutils release uses.
- A small number of files are installed by automake and will therefore
- vary according to which version of automake was used to generate a
- release. This includes for example boiler-plate GNU files such as
- @file{ABOUT-NLS}, @file{INSTALL} and @file{COPYING}.
- @section How We Fix Gnulib Bugs
- Gnulib is used by quite a number of GNU projects, and this means that
- it gets plenty of testing. Therefore there are relatively few bugs in
- the Gnulib code, but it does happen from time to time.
- However, since there is no waiting around for a Gnulib source release
- tarball, Gnulib bugs are generally fixed quickly. Here is an outline
- of the way we would contribute a fix to Gnulib (assuming you know it
- is not already fixed in the current Gnulib git tree):
- @table @asis
- @item Check that you have already completed a copyright assignment for Gnulib
- @item Begin with a vanilla git tree
- Download the Findutils source code from git (or use the tree you have
- already)
- @item Run the bootstrap script
- @item Run configure
- @item Build findutils
- Build findutils and run the test suite, which should pass. In our
- example we assume you have just noticed a bug in Gnulib, not that
- recent Gnulib changes broke the findutils regression tests.
- @item Write a test case
- If in fact Gnulib did break the findutils regression tests, you can probably
- skip this step, since you already have a test case demonstrating the problem.
- Otherwise, write a findutils test case for the bug and/or a Gnulib test case.
- @item Fix the Gnulib bug
- Make sure your editor follows symbolic links so that your changes to
- @file{gnulib/...} actually affect the files in the git working
- directory you checked out earlier. Observe that your test now passes.
- @item Prepare a Gnulib patch
- In the gnulib subdirectory, use @code{git format-patch} to prepare the
- patch. Follow the normal usage for checkin comments (take a look at
- the output of @code{git log}). Check that the patch conforms with the
- GNU coding standards, and email it to the Gnulib mailing list.
- @item Wait for the patch to be applied
- Once your bug fix has been applied, you can update your gnulib
- directory from git, and then check in the change to the submodule as
- normal (you can check @code{git help submodule} for details).
- @end table
- There is an alternative to the method above; it is possible to store
- local diffs to be patched into gnulib beneath the
- @file{gnulib-local} directory. Normally, however, there is no need for this,
- since gnulib updates are very prompt.
- @section How to update Gnulib to latest
- With a clean working tree, the command @code{make update-gnulib-to-latest}
- (or the shorter alias @code{make gnulib-sync}) updates the
- gnulib submodule. In detail, that means:
- @enumerate
- @item Fetching the latest upstream gnulib reference.
- @item Copying the files which should stay in sync like
- @file{bootstrap} from gnulib into the findutils working tree.
- @item And finally showing the @code{git status} for the gnulib submodule
- and the above copied files.
- @end enumerate
- After that, the maintainer checks that everything is correct and that
- findutils builds and runs correctly, and finally commits the new gnulib
- version, e.g. via @code{git gui}.
- The @code{gnulib-sync} target can be run at any time after a @code{configure}
- run; it refuses to run only if the working tree is dirty.
- @node Documentation
- @chapter Documentation
- The findutils git tree includes several different types of
- documentation.
- @section git change log
- The git change log for the source tree contains check-in messages
- which describe each check-in. These have a standard format:
- @smallexample
- Summary of the change.
- (ChangeLog-style detail)
- @end smallexample
- Here, the format of the detail part follows the standard GNU ChangeLog
- style, but without whitespace in the left margin and without
- author/date headers. Take a look at the output of @code{git log} to
- see some examples. The README-hacking file also contains an example
- with an explanation.
- @section User Documentation
- User-oriented documentation is provided as manual pages and in
- Texinfo. See
- @ref{Introduction,,Introduction,find,The Findutils manual}.
- Please make sure both sets of documentation are updated if you make a
- change to the code. The GNU coding standards do not normally call for
- maintaining manual pages on the grounds of effort duplication.
- However, the manual page format is more convenient for quick
- reference, and so it's worth maintaining both types of documentation.
- However, the manual pages are normally rather more terse than the
- Texinfo documentation. The manual pages are suitable for reference
- use, but the Texinfo manual should also include introductory and
- tutorial material.
- We make the user documentation available on the web, on the GNU
- project web site. These web pages are source-controlled via CVS
- (still!). If you are a member of the @samp{findutils} project on
- Savannah you should be able to check the web pages out like this
- (@samp{$USER} is a placeholder for your Savannah username):
- @smallexample
- cvs -d :ext:$USER@@cvs.savannah.gnu.org:/web/findutils checkout findutils/manual
- @end smallexample
- You can automatically update the documentation in this repository
- using the script @samp{build-aux/update-online-manual.sh} in the
- findutils Git repository.
- @section Build Guidance
- @table @file
- @item ABOUT-NLS
- Describes the Free Translation Project, the translation status of
- various GNU projects, and how to participate by translating an
- application.
- @item AUTHORS
- Lists the authors of findutils.
- @item COPYING
- The copyright license covering findutils; currently, the GNU GPL,
- version 3.
- @item INSTALL
- Generic installation instructions for installing GNU programs.
- @item README
- Information about how to compile findutils in particular.
- @item README-hacking
- Describes how to build findutils from the code in git.
- @item THANKS
- Thanks to people who contributed to findutils. Generally, if
- someone's contribution was significant enough to need a copyright
- assignment, their name should go in here.
- @item TODO
- Mainly obsolete. Please add bugs to the Savannah bug tracker instead
- of adding entries to this file.
- @end table
- @section Release Information
- @table @file
- @item NEWS
- Enumerates the user-visible changes in each release. Typical changes
- are fixed bugs, functionality changes and documentation changes.
- Include the date when a release is made.
- @item ChangeLog
- This file enumerates all changes to the findutils source code (with
- the possible exception of @file{.cvsignore} and @file{.gitignore}
- changes). The level of detail used for this file should be sufficient
- to answer the questions ``what changed?'' and ``why was it changed?''.
- The file is generated from the git commit messages during @code{make dist}.
- If a change fixes a bug, always give the bug reference number in the
- @file{NEWS} file and of course also in the checkin message.
- In general, it should be possible to enumerate all
- material changes to a function by searching for its name in
- @file{ChangeLog}. Mention when each release is made.
- @end table
- @node Testing
- @chapter Testing
- This chapter will explain the general procedures for adding tests to
- the test suite, and the functions defined in the findutils-specific
- DejaGnu configuration. Where appropriate references will be made to
- the DejaGnu documentation.
- @node Bugs
- @chapter Bugs
- Bugs are logged in the Savannah bug tracker
- @url{https://savannah.gnu.org/bugs/?group=findutils}. The tracker
- offers several fields but their use is largely obvious. The
- life-cycle of a bug is like this:
- @table @asis
- @item Open
- Someone, usually a maintainer, a distribution maintainer or a user,
- creates a bug by filling in the form. They fill in field values as
- they see fit. This will generate an email to
- @email{bug-findutils@@gnu.org}.
- @item Triage
- The bug hangs around with @samp{Status=None} until someone begins to
- work on it. At that point they set the ``Assigned To'' field and will
- sometimes set the status to @samp{In Progress}, especially if the bug
- will take a while to fix.
- @item Non-bugs
- Quite a lot of reports are not actually bugs; for these the usual
- procedure is to explain why the problem is not a bug, set the status
- to @samp{Invalid} and close the bug. Make sure you set the
- @samp{Assigned to} field to yourself before closing the bug.
- @item Fixing
- When you commit a bug fix into git (or in the case of a contributed
- patch, commit the change), mark the bug as @samp{Fixed}. Make sure
- you include a new test case where this is relevant. If you can figure
- out which releases are affected, please also set the @samp{Release}
- field to the earliest release which is affected by the bug.
- Indicate which source branch the fix is included in (for example,
- 4.2.x or 4.3.x). Don't close the bug yet.
- @item Release
- When a release is made which includes the bug fix, make sure the bug
- is listed in the NEWS file. Once the release is made, fill in the
- @samp{Fixed Release} field and close the bug.
- @end table
- @node Distributions
- @chapter Distributions
- Almost all GNU/Linux distributions include findutils, but only some of
- them have a package maintainer who is a member of the mailing list.
- Distributions don't often feed back patches to the
- @email{bug-findutils@@gnu.org} list, but on the other hand many of
- their patches relate only to standards for file locations and so
- forth, and are therefore distribution specific. On an irregular basis
- I check the current patches being used by one or two distributions,
- but the total number of GNU/Linux distributions is large enough that
- we could not hope to cover them all.
- Often, bugs are raised against a distribution's bug tracker instead of
- GNU's. Periodically (about every six months) I take a look at some
- of the more accessible bug trackers to indicate which bugs have been
- fixed upstream.
- Many distributions include both findutils and the slocate package,
- which provides a replacement @code{locate}.
- @node Internationalisation
- @chapter Internationalisation
- Translation is essentially automated from the maintainer's point of
- view. The TP mails the maintainer when a new PO file is available,
- and we just download it and check it in. The @file{bootstrap} script
- copies @file{.po} files into the working tree. For more information,
- please see
- @url{https://translationproject.org/domain/findutils.html}.
- @node Security
- @chapter Security
- See @ref{Security Considerations, ,Security Considerations,find,The
- Findutils manual}, for a full description of the findutils approach to
- security considerations and discussion of particular tools.
- If someone reports a security bug publicly, we should fix this as
- rapidly as possible. If necessary, this can mean issuing a fixed
- release containing just the one bug fix. We try to avoid issuing
- releases which include both significant security fixes and functional
- changes.
- Where someone reports a security problem privately, we generally try
- to construct and test a patch without pushing the intermediate code to
- the public repository.
- Once everything has been tested, this allows us to make a release and
- push the patch. The advantage of doing things this way is that we
- avoid situations where people watching for git commits can figure out
- and exploit a security problem before a fixed release is available.
- It's important that security problems be fixed promptly, but don't
- rush so much that things go wrong. Make sure the new release really
- fixes the problem. It's usually best not to include functional
- changes in your security-fix release.
- If the security problem is serious, send an alert to
- @email{vendor-sec@@lst.de}. The members of the list include most
- GNU/Linux distributions. The point of doing this is to allow them to
- prepare to release your security fix to their customers, once the fix
- becomes available. Here is an example alert:
- @smallexample
- GNU findutils heap buffer overrun (potential privilege escalation)
- I. BACKGROUND
- =============
- GNU findutils is a set of programs which search for files on Unix-like
- systems. It is maintained by the GNU Project of the Free Software
- Foundation. For more information, see
- @url{https://www.gnu.org/software/findutils}.
- II. DESCRIPTION
- ===============
- When GNU locate reads filenames from an old-format locate database,
- they are read into a fixed-length buffer allocated on the heap.
- Filenames longer than the 1026-byte buffer can cause a buffer overrun.
- The overrunning data can be chosen by any person able to control the
- names of files created on the local system. This will normally
- include all local users, but in many cases also remote users (for
- example in the case of FTP servers allowing uploads).
- III. ANALYSIS
- =============
- Findutils supports three different formats of locate database, its
- native format "LOCATE02", the slocate variant of LOCATE02, and a
- traditional ("old") format that locate uses on other Unix systems.
- When locate reads filenames from a LOCATE02 database (the default
- format), the buffer into which data is read is automatically extended
- to accommodate the length of the filenames.
- This automatic buffer extension does not happen for old-format
- databases. Instead a 1026-byte buffer is used. When a longer
- pathname appears in the locate database, the end of this buffer is
- overrun. The buffer is allocated on the heap (not the stack).
- If the locate database is in the default LOCATE02 format, the locate
- program does perform automatic buffer extension, and the program is
- not vulnerable to this problem. The software used to build the
- old-format locate database is not itself vulnerable to the same
- attack.
- Most installations of GNU findutils do not use the old database
- format, and so will not be vulnerable.
- IV. DETECTION
- =============
- Software
- --------
- All existing releases of findutils are affected.
- Installations
- -------------
- To discover the longest path name on a given system, you can use the
- following command (requires GNU findutils and GNU coreutils):
- @verbatim
- find / -print0 | tr -c '\0' 'x' | tr '\0' '\n' | wc -L
- @end verbatim
- V. EXAMPLE
- ==========
- This section includes a shell script which determines which of a list
- of locate binaries is vulnerable to the problem. The shell script has
- been tested only on glibc based systems having a mktemp binary.
- NOTE: This script deliberately overruns the buffer in order to
- determine if a binary is affected. Therefore running it on your
- system may have undesirable effects. We recommend that you read the
- script before running it.
- @verbatim
- #! /bin/sh
- set +m
- if vanilla_db="$(mktemp nicedb.XXXXXX)" ; then
-     if updatedb --prunepaths="" --old-format --localpaths="/tmp" \
-         --output="${vanilla_db}" ; then
-         true
-     else
-         rm -f "${vanilla_db}"
-         vanilla_db=""
-         echo "Failed to create old-format locate database; skipping the sanity checks" >&2
-     fi
- fi
- make_overrun_db() {
-     # Start with a valid database
-     cat "${vanilla_db}"
-     # Make the final entry really long
-     dd if=/dev/zero bs=1 count=1500 2>/dev/null | tr '\000' 'x'
- }
- ulimit -c 0
- usage() { echo "usage: $0 binary [binary...]" >&2; exit $1; }
- [ $# -eq 0 ] && usage 1
- bad=""
- good=""
- ugly=""
- if dbfile="$(mktemp nasty.XXXXXX)"
- then
-     make_overrun_db > "$dbfile"
-     for locate ; do
-         ver="$locate = $("$locate" --version | head -1)"
-         if [ -z "$vanilla_db" ] || "$locate" -d "$vanilla_db" "" >/dev/null ; then
-             "$locate" -d "$dbfile" "" >/dev/null
-             if [ $? -gt 128 ] ; then
-                 bad="$bad
- vulnerable: $ver"
-             else
-                 good="$good
- good: $ver"
-             fi
-         else
-             # the regular locate failed
-             ugly="$ugly
- buggy, may or may not be vulnerable: $ver"
-         fi
-     done
-     rm -f "${dbfile}" "${vanilla_db}"
-     # good: unaffected. bad: affected (vulnerable).
-     # ugly: doesn't even work for a normal old-format database.
-     echo "$good"
-     echo "$bad"
-     echo "$ugly"
- else
-     exit 1
- fi
- @end verbatim
- VI. VENDOR RESPONSE
- ===================
- The GNU project discovered the problem while 'locate' was being worked
- on; this is the first public announcement of the problem.
- The GNU findutils maintainer has issued a patch as part of this
- announcement. The patch appears below.
- A source release of findutils-4.2.31 will be issued on 2007-05-30.
- That release will of course include the patch. The patch will be
- committed to the public CVS repository at the same time. Public
- announcements of the release, including a description of the bug, will
- be made at the same time as the release.
- A release of findutils-4.3.x will follow and will also include the
- patch.
- VII. PATCH
- ==========
- This patch should apply to findutils-4.2.23 and later.
- Findutils-4.2.23 was released almost two years ago.
- @verbatim
- Index: locate/locate.c
- ===================================================================
- RCS file: /cvsroot/findutils/findutils/locate/locate.c,v
- retrieving revision 1.58.2.2
- diff -u -p -r1.58.2.2 locate.c
- --- locate/locate.c 22 Apr 2007 16:57:42 -0000 1.58.2.2
- +++ locate/locate.c 28 May 2007 10:18:16 -0000
- @@ -124,9 +124,9 @@ extern int errno;
- #include "locatedb.h"
- #include <getline.h>
- -#include "../gnulib/lib/xalloc.h"
- -#include "../gnulib/lib/error.h"
- -#include "../gnulib/lib/human.h"
- +#include "xalloc.h"
- +#include "error.h"
- +#include "human.h"
- #include "dirname.h"
- #include "closeout.h"
- #include "nextelem.h"
- @@ -468,10 +468,36 @@ visit_justprint_unquoted(struct process_
- return VISIT_CONTINUE;
- }
- +static void
- +toolong (struct process_data *procdata)
- +{
- + error (EXIT_FAILURE, 0,
- + _("locate database %s contains a "
- + "filename longer than locate can handle"),
- + procdata->dbfile);
- +}
- +
- +static void
- +extend (struct process_data *procdata, size_t siz1, size_t siz2)
- +{
- + /* Figure out if the addition operation is safe before performing it. */
- + if (SIZE_MAX - siz1 < siz2)
- + {
- + toolong (procdata);
- + }
- + else if (procdata->pathsize < (siz1+siz2))
- + {
- + procdata->pathsize = siz1+siz2;
- + procdata->original_filename = x2nrealloc (procdata->original_filename,
- + &procdata->pathsize,
- + 1);
- + }
- +}
- +
- static int
- visit_old_format(struct process_data *procdata, void *context)
- {
- - register char *s;
- + register size_t i;
- (void) context;
- /* Get the offset in the path where this path info starts. */
- @@ -479,20 +505,35 @@ visit_old_format(struct process_data *pr
- procdata->count += getw (procdata->fp) - LOCATEDB_OLD_OFFSET;
- else
- procdata->count += procdata->c - LOCATEDB_OLD_OFFSET;
- + assert(procdata->count > 0);
- - /* Overlay the old path with the remainder of the new. */
- - for (s = procdata->original_filename + procdata->count;
- + /* Overlay the old path with the remainder of the new. Read
- + * more data until we get to the next filename.
- + */
- + for (i=procdata->count;
- (procdata->c = getc (procdata->fp)) > LOCATEDB_OLD_ESCAPE;)
- - if (procdata->c < 0200)
- - *s++ = procdata->c; /* An ordinary character. */
- - else
- - {
- - /* Bigram markers have the high bit set. */
- - procdata->c &= 0177;
- - *s++ = procdata->bigram1[procdata->c];
- - *s++ = procdata->bigram2[procdata->c];
- - }
- - *s-- = '\0';
- + {
- + if (procdata->c < 0200)
- + {
- + /* An ordinary character. */
- + extend (procdata, i, 1u);
- + procdata->original_filename[i++] = procdata->c;
- + }
- + else
- + {
- + /* Bigram markers have the high bit set. */
- + extend (procdata, i, 2u);
- + procdata->c &= 0177;
- + procdata->original_filename[i++] = procdata->bigram1[procdata->c];
- + procdata->original_filename[i++] = procdata->bigram2[procdata->c];
- + }
- + }
- +
- + /* Consider the case where we executed the loop body zero times; we
- + * still need space for the terminating null byte.
- + */
- + extend (procdata, i, 1u);
- + procdata->original_filename[i] = 0;
- procdata->munged_filename = procdata->original_filename;
- @end verbatim
- VIII. THANKS
- ============
- Thanks to Rob Holland <rob@@inversepath.com> and Tavis Ormandy.
- IX. CVE INFORMATION
- ===================
- No CVE candidate number has yet been assigned for this vulnerability.
- If someone provides one, I will include it in the public announcement
- and change logs.
- @end smallexample
- The original announcement above was sent out with a cleartext PGP
- signature, of course, but that has been omitted from the example.
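- The signing step itself might look like the sketch below. It assumes
- GnuPG with a usable secret key; both function names are hypothetical.
- Note that @code{is_clearsigned} only looks for the header line that
- OpenPGP cleartext signatures begin with. It does not verify the
- signature, for which @code{gpg --verify} is the appropriate tool.

```shell
#!/bin/sh
# Clear-sign FILE, writing FILE.asc (needs GnuPG and a secret key).
sign_alert() {
    gpg --output "$1.asc" --clearsign "$1"
}

# Succeed if FILE starts with the OpenPGP cleartext-signature header.
# This is a format check only; it does not verify the signature.
is_clearsigned() {
    first_line=$(head -n 1 "$1") || return 2
    [ "$first_line" = "-----BEGIN PGP SIGNED MESSAGE-----" ]
}
```

- A cheap guard before mailing the alert is then something like
- @code{is_clearsigned alert.txt.asc}, where the file name is whatever
- you passed to @code{sign_alert} with @samp{.asc} appended.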
- Once a fixed release is available, announce the new release using the
- normal channels. Any CVE number assigned for the problem should be
- included in the @file{ChangeLog} and @file{NEWS} entries. See
- @url{https://cve.mitre.org/} for an explanation of CVE numbers.
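- Before announcing, it can be worth checking mechanically that the
- assigned identifier really appears in both files. The functions below
- are a minimal sketch with hypothetical names. The pattern assumes the
- usual identifier shape: @samp{CVE}, a four-digit year, and a sequence
- number of four or more digits.

```shell
#!/bin/sh
# Succeed if the argument looks like a CVE identifier.
is_cve_id() {
    printf '%s\n' "$1" | grep -Eq '^CVE-[0-9]{4}-[0-9]{4,}$'
}

# Usage: cve_is_recorded CVE-ID FILE...
# Succeed only if every FILE mentions the identifier.
cve_is_recorded() {
    cve=$1; shift
    is_cve_id "$cve" || return 2
    for f in "$@"; do
        grep -Fq -- "$cve" "$f" || {
            echo "$f does not mention $cve" >&2
            return 1
        }
    done
}
```

- For instance, @code{cve_is_recorded CVE-2099-12345 ChangeLog NEWS}
- (with a made-up identifier) fails and names the first file that lacks
- the entry.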
- @node Making Releases
- @chapter Making Releases
- This section explains how to make a findutils release. For the
- time being, it is only a terse description of the main steps:
- @set RELEASE X.Y.Z
- @set RELTAG v@value{RELEASE}
- @enumerate
- @item Commit changes; make sure your working directory has no
- uncommitted changes.
- @item Update translation files; re-run bootstrap to download the
- newest @samp{.po} files.
- @item Make sure compiler warnings would block the release; re-run
- @samp{configure} with the options
- @code{--enable-compiler-warnings --enable-compiler-warnings-are-errors}.
- @item Test; make sure that all changes you have made have tests, and
- that the tests pass.
- Verify this with @code{env RUN_EXPENSIVE_TESTS=yes make distcheck}.
- @c The RUN_EXPENSIVE_TESTS environment variable is checked in init.cfg.
- @item Bugs; make sure all Savannah bug entries fixed in this release
- are marked as fixed in Savannah. Optionally close them too to save
- duplicate work (otherwise, close them after the release is uploaded).
- @item Add new release in Savannah field values; see the @code{Bugs >
- Edit Field Values} menu item. Add a field value for the release you
- are about to make so that users can report bugs in it.
- @item Update version; make sure that the NEWS file
- is updated with the new release number (and checked in).
- @c There is no longer any need to update configure.ac, since it no
- @c longer contains version information.
- @item Tag the release; findutils release tags look like
- @samp{v4.5.5}, for example. You can create a tag with a command like this:
- @c we use @example here because @value will not work within @code or @samp.
- @example
- git tag -s -m "Findutils release @value{RELEASE}" @value{RELTAG}
- @end example
- @noindent
- @item Build the release tarball; do this with @code{make distcheck}.
- Copy the tarball somewhere safe.
- @item Merge; if the release (and signed tag) were made on a
- local branch, merge the branch to your local master.
- @item Push; push your master to origin/master.
- @item Push the new release tag; assuming that the name of your remote is
- @samp{origin}, this is:
- @example
- git push origin tag @value{RELTAG}
- @end example
- @item Prepare the upload and upload it.
- You can do this with
- @c we use @example here because @value will not work within @code or @samp.
- @example
- build-aux/gnupload --to ftp.gnu.org:findutils findutils-@value{RELEASE}.tar.xz
- @end example
- @noindent
- Use @code{alpha.gnu.org:findutils} for an alpha or beta release.
- @xref{Automated FTP Uploads, ,Automated FTP
- Uploads, maintain, Information for Maintainers of GNU Software},
- for detailed upload instructions.
- @item Check the FTP upload worked; you can look for an email from the
- robot or check the contents of the actual FTP site.
- @item Make a release announcement; include an extract from the NEWS
- file which explains what's changed. Announcements for test releases
- should just go to @email{bug-findutils@@gnu.org}. Announcements for
- stable releases should go to @email{info-gnu@@gnu.org} as well.
- @item Post-release administration: add a new dummy release header in NEWS:
- @code{* Major changes in release ?.?.?, YYYY-MM-DD}
- and update the @code{old_NEWS_hash} in @file{cfg.mk} with
- @code{make update-NEWS-hash}.
- Commit both changes.
- @c make update-NEWS-hash supports make news-check but we normally
- @c don't do that (and I'm not sure that the current NEWS file would
- @c pass the check anyway).
- @item Close bugs; any bugs recorded on Savannah which were fixed in this
- release should now be marked as closed if they were not already.
- Update the @samp{Fixed Release} field of these bugs appropriately and
- make sure the @samp{Assigned to} field is populated.
- @end enumerate
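- The command-line steps in the list above can be collected into a
- dry-run helper that only prints what would be run, so that nothing is
- tagged or uploaded by accident. This is a sketch: @code{release_plan}
- is a hypothetical function, not part of findutils, and it assumes an
- in-tree build, a remote called @samp{origin} and a branch called
- @samp{master}. The Savannah and announcement steps remain manual.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the release commands for X.Y.Z.
# It prints commands only; it never runs them.
release_plan() {
    release=$1
    # Loose sanity check: three dot-separated parts starting with digits.
    case $release in
        [0-9]*.[0-9]*.[0-9]*) ;;
        *) echo "usage: release_plan X.Y.Z" >&2; return 1 ;;
    esac
    tag="v$release"
    cat <<EOF
./configure --enable-compiler-warnings --enable-compiler-warnings-are-errors
env RUN_EXPENSIVE_TESTS=yes make distcheck
git tag -s -m "Findutils release $release" $tag
git push origin master
git push origin tag $tag
build-aux/gnupload --to ftp.gnu.org:findutils findutils-$release.tar.xz
EOF
}
```

- Running @code{release_plan 4.5.5} prints six commands in the order
- given by the list. Reviewing that output before typing anything is
- the point of the dry run.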
- @node GNU Free Documentation License
- @appendix GNU Free Documentation License
- @include fdl.texi
- @bye
- @comment texi related words used by Emacs' spell checker ispell.el
- @comment LocalWords: texinfo setfilename settitle setchapternewpage
- @comment LocalWords: iftex finalout ifinfo DIR titlepage vskip pt
- @comment LocalWords: filll dir samp dfn noindent xref pxref
- @comment LocalWords: var deffn texi deffnx itemx emph asis
- @comment LocalWords: findex smallexample subsubsection cindex
- @comment LocalWords: dircategory direntry itemize
- @comment other words used by Emacs' spell checker ispell.el
- @comment LocalWords: README fred updatedb xargs Plett Rendell akefile
- @comment LocalWords: args grep Filesystems fo foo fOo wildcards iname
- @comment LocalWords: ipath regex iregex expr fubar regexps
- @comment LocalWords: metacharacters macs sr sc inode lname ilname
- @comment LocalWords: sysdep noleaf ls inum xdev filesystems usr atime
- @comment LocalWords: ctime mtime amin cmin mmin al daystart Sladkey rm
- @comment LocalWords: anewer cnewer bckw rf xtype uname gname uid gid
- @comment LocalWords: nouser nogroup chown chgrp perm ch maxdepth
- @comment LocalWords: mindepth cpio src CD AFS statted stat fstype ufs
- @comment LocalWords: nfs tmp mfs printf fprint dils rw djm Nov lwall
- @comment LocalWords: POSIXLY fls fprintf strftime locale's EDT GMT AP
- @comment LocalWords: EST diff perl backquotes sprintf Falstad Oct cron
- @comment LocalWords: eg vmunix mkdir afs allexec allwrite ARG bigram
- @comment LocalWords: bigrams cd chmod comp crc CVS dbfile eof
- @comment LocalWords: fileserver filesystem fn frcode Ghazi Hnewc iXX
- @comment LocalWords: joeuser Kaveh localpaths localuser LOGNAME
- @comment LocalWords: Meyering mv netpaths netuser nonblank nonblanks
- @comment LocalWords: ois ok Pinard printindex proc procs prunefs
- @comment LocalWords: prunepaths pwd RFS rmadillo rmdir rsh sbins str
- @comment LocalWords: su Timar ubins ug unstripped vf VM Weitzel
- @comment LocalWords: wildcard zlogout basename execdir wholename iwholename
- @comment LocalWords: timestamp timestamps Solaris FreeBSD OpenBSD POSIX
|