As an experienced Linux system administrator, I often need to inspect the contents of software packages. Ubuntu and Debian packages use the .deb format: an ar archive that bundles compressed tarballs of the files to install together with the package's control metadata. When managing complex production systems, it's crucial to understand precisely what gets extracted or overwritten during package upgrades. Moreover, identifying dependencies and shared libraries helps troubleshoot tricky conflicts between installed software. This guide breaks down several methods for examining and verifying the files within .deb packages in Ubuntu Linux.

Fundamentals of the Debian Packaging System

Before diving into querying package contents, let's briefly discuss some key concepts of Debian's packaging infrastructure:

  • Repositories – Central software archives containing available .deb package files, metadata, dependencies, etc. Ubuntu draws packages from the main Ubuntu repositories as well as supplementary PPAs.

  • Package Manager – Frontend tools like apt or graphical App Stores that retrieve packages from repositories and orchestrate installation. The key underlying manager is dpkg which unpacks archives and sets up packages.

  • Package Metadata – Each .deb has control data like package name, version, maintainer, dependencies, description etc. This metadata and package contents are stored in the local package database.

  • Shared Libraries – Reusable compiled code loaded at application runtime rather than linked statically into each executable. Debian and Ubuntu packages generally prefer dynamic shared libraries over static linking.

So in essence, Ubuntu's packaging apparatus fetches .deb archives from repositories, extracts metadata to update the local database, installs the included files in their designated locations, sets up symlinks, and tracks dependencies on shared library packages. Understanding this infrastructure helps when querying and verifying package contents across systems.

Website Search in Ubuntu Package Explorer

The Ubuntu Packages Search portal allows browsing and exploring package contents via any web browser. This official webapp queries the Ubuntu repositories directly without needing local package files or a Linux installation. Follow these simple steps:

  1. Navigate to https://packages.ubuntu.com
  2. Scroll down to the 'Search package directories' section and enter the desired package name as the keyword
  3. For accurate results, check 'Only show exact matches'
  4. Leave the default Ubuntu release selected or pick a specific distribution
  5. Hit 'Search'
  6. On the results page, choose your package and, on its detail page, the architecture
  7. Click 'list of files' to view the contents extracted during installation

Additionally, you can search for specific filenames rather than just packages. This looks up which packages contain that file across all repositories.

Here are some examples illustrating the Ubuntu Package Search in action:

Querying Contents of the Nginx Web Server Package

  1. Search for 'nginx'
  2. Open latest version for amd64 architecture
  3. Review complete file list extracted upon installation

This shows all configuration files, systemd service scripts, binary executables and other files set up by the package.

Checking Which Package Provides a Shared Library

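  1. Enter 'libpython3.8.so' in the file search
  2. Notice that the matching files belong to the libpython3.8 packages (the runtime library in libpython3.8, the unversioned development symlink in libpython3.8-dev) rather than to the python3.8 interpreter package

This quickly verifies that the shared library ships in its own library package.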

The Ubuntu Package Search site provides the simplest way to check any package's contents. The drawback is having to look up each package manually rather than running bulk queries from a terminal. Let's move on to more advanced package content analysis methods.

Leveraging dpkg for Low-Level Package Inspection

The dpkg command line program installed in all Debian-based systems forms the core of package management. It takes care of unpacking archives, configuring packages, registering metadata and setting up system files. While package managers like apt utilize dpkg in the backend, running dpkg directly gives precise control when querying package contents:

List All Files Installed by a Package

dpkg -L nginx

Inspect Contents of a .deb File

dpkg-deb -c downloads/package.deb

Check Files in an Uninstalled Package

dpkg --contents package_version.deb

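Note that dpkg -L works without the original .deb file, because it reads the file list recorded in the local dpkg database (under /var/lib/dpkg/info) for installed packages. In contrast, dpkg --contents is equivalent to dpkg-deb -c and reads the archive itself, so the .deb file must be available locally.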
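Large packages produce long listings, so it often helps to narrow them down. The following is a minimal sketch, assuming a one-path-per-line listing in the format printed by dpkg -L; the function name list_under is hypothetical and not part of dpkg.

```shell
# list_under DIR: read a one-path-per-line listing (the format printed by
# `dpkg -L`) on stdin and keep only the paths below DIR.
# The function name is illustrative; it is not part of dpkg.
list_under() {
    dir=${1%/}    # drop a trailing slash so "/etc" and "/etc/" behave alike
    awk -v prefix="$dir/" 'index($0, prefix) == 1'
}

# Typical use on a system where the package is installed:
#   dpkg -L nginx-common | list_under /etc
```

Matching on the prefix plus a trailing slash avoids false hits such as /etcetera when filtering for /etc.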

A lesser known but extremely useful feature is searching all installed packages for a specific file:

dpkg -S filename

For instance, to check which package contains the Python shared library:

$ dpkg -S libpython3.8.so.1.0
libpython3.8:amd64: /usr/lib/x86_64-linux-gnu/libpython3.8.so.1.0

This quickly indicates that libpython3.8 owns that shared library – much faster than web searching all packages manually.
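When a pattern matches many files, the list of owning packages is usually what matters. Here is a small sketch, assuming input in the "package: /path" format printed by dpkg -S; the function name owners and the sample package names in the usage comment are illustrative only.

```shell
# owners: read `dpkg -S` style lines on stdin and print each owning
# package once. The function name is illustrative.
owners() {
    # Split on the ": /" that precedes the path, so architecture qualifiers
    # such as "libc6:amd64" stay attached to the package name.
    awk -F': /' 'NF > 1 && $1 !~ /^diversion / {
        # Paths shared by several packages are reported as "pkg-a, pkg-b".
        n = split($1, pkgs, ", ")
        for (i = 1; i <= n; i++) print pkgs[i]
    }' | LC_ALL=C sort -u
}

# Typical use:
#   dpkg -S '/etc/nginx/*' | owners
```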

In summary, dpkg provides precise control over querying package contents on an Ubuntu system. The main limitations are the need for familiarity with the terminal and the overwhelming output for large packages. Next let's look at adding a search index for easier file lookups.

Enhancing Queries with apt-file Search Index

The apt-file utility improves package content queries by building a searchable index mapped to specific files and packages. This supplements dpkg with file-centric searches without having to guess relevant packages. After a one-time setup, the system-wide index enhances exploring package composition and identifying file ownership.

Install and Update Index

sudo apt update
sudo apt install apt-file 
sudo apt-file update

Check All Contents of a Package

apt-file list postgresql

Search for a File to Identify Parent Package

apt-file search /usr/bin/python3

Lookup Which Package Contains a Config File

apt-file find /etc/nginx/nginx.conf  

Now you can instantly search for files or directories to determine which of the indexed packages include them, whether or not those packages are installed. The initial index update may take time depending on the number of configured repositories and the network speed, but it enables very quick subsequent searches.

For example, identifying configuration file ownership reveals insights about package relationships:

$ apt-file find /etc/default/nginx
nginx-common: /etc/default/nginx

This shows the nginx-common package owns the default config file used by all nginx packages.
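Broad searches can return hundreds of matches, and a per-package tally makes them easier to read. The sketch below assumes the "package: /path" result format shown above; files_per_package is a hypothetical helper name, not an apt-file subcommand.

```shell
# files_per_package: read "package: /path" lines (the format printed by
# `apt-file search` and `apt-file list`) on stdin and print a count of
# matching files per package, largest first. The name is illustrative.
files_per_package() {
    awk -F': ' '{ count[$1]++ }
        END { for (p in count) print count[p], p }' |
        LC_ALL=C sort -k1,1nr -k2,2
}

# Typical use once the index has been built:
#   apt-file search nginx.conf | files_per_package
```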

In summary, apt-file vastly simplifies exploratory package searches across the Linux filesystem based on contents. It serves as a lookup index mapping files to owning packages. Pairing apt-file's search capabilities with dpkg's inspection commands delivers a powerful one-two punch for analyzing package composition.

Checking Installation Footprint of Packages

While examining contents provides finer package details, there are cases where we just need size estimates, for example before installing large packages. The following methods retrieve a package's disk footprint for installed packages, local .deb files, and packages that are only available in the repositories:

Check Size of an Installed Package

dpkg-query -s nginx

Sample truncated output:

...
Installed-Size: 5185
...

Installed-Size is an estimate, in KiB, of the disk space the package occupies once installed, covering all extracted files, directories, symlinks etc.
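The same field can be used to rank packages by footprint. This is a minimal sketch, assuming tab-separated "size, package" lines such as those produced by a dpkg-query format string; largest_packages is a hypothetical helper name.

```shell
# largest_packages [N]: read "<Installed-Size in KiB><TAB><package>" lines
# on stdin and print the N largest (default 10) with sizes in MiB.
# The function name is illustrative.
largest_packages() {
    limit=${1:-10}
    LC_ALL=C sort -k1,1nr | head -n "$limit" |
        LC_ALL=C awk -F'\t' '{ printf "%.1f MiB\t%s\n", $1 / 1024, $2 }'
}

# Typical use:
#   dpkg-query -W -f='${Installed-Size}\t${Package}\n' | largest_packages 5
```

Dividing by 1024 converts the KiB value to MiB; since Installed-Size is itself an estimate, the result is only a rough guide.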

Inspect Size Details of a .deb File

dpkg-deb -I python3_3.8.2-1.deb

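Sample truncated output:

 new Debian package, version 2.0.
 size 321428 bytes: ...
...
 Installed-Size: 81786
...

This reports both the installed size (in KiB) and the size of the compressed .deb archive (in bytes) without the package being installed. The .deb file itself must be available locally.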

View Summary of an Available Package

apt show gimp

Sample truncated output:

...  
Installed-Size: 69.1 MB
...

So apt retrieves size and other metadata for packages in repositories without separate tools.

In summary, Debian packaging tools provide multiple options to determine installation sizes ranging from granular file listings to summary overviews. Intelligently pairing size checks with content queries enables informed decisions for addressing downstream storage constraints.

Deep Diving with dpkg File Classifications

Now that we have covered several techniques to query package contents, an advanced topic is the control information that accompanies a package's files and tells dpkg how to treat them during installation. This provides insight into the expected layout when building custom .deb packages using checkinstall and similar tools.

File Designations

The main control files that dpkg recognizes inside a package:

  • control – Package name, version, dependencies and description
  • conffiles – List of configuration files that dpkg tracks, so local modifications are preserved across upgrades
  • preinst/postinst – Scripts running before/after install
  • prerm/postrm – Scripts running before/after removal
  • symbols – Symbols exported by shared libraries, used to compute library dependencies at build time
  • triggers – Declares the triggers this package is interested in or activates for other packages

View Designation Details

List the control archive of a .deb by piping it through tar:

dpkg-deb --ctrl-tarfile package.deb | tar -tvf -

Sample truncated output:

drwxr-xr-x root/root         0 2021-08-22 12:45 ./
-rwxr-xr-x root/root       148 2021-08-22 12:45 ./postinst
-rw-r--r-- root/root      6148 2021-08-22 12:45 ./triggers
...

This lists each control file along with its permissions, size and other metadata.

Let's break down the control files exposed:

  • postinst – Script that runs after package installation
  • triggers – Declares triggers that this package awaits or activates

Similarly, the conffiles control file lists the absolute path of each tracked configuration file, one per line:

/etc/package/config.conf

So dpkg's control files provide a deeper understanding of how package maintainers expect installations to integrate with the base system.
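One practical check when building custom packages is that the maintainer scripts are executable, since dpkg-deb rejects scripts with bad permissions at build time. The sketch below assumes a tar -tv style listing of the control archive like the sample above; nonexecutable_scripts is a hypothetical helper name.

```shell
# nonexecutable_scripts: read a `tar -tv` style listing of a control
# archive on stdin and print maintainer scripts that lack any execute bit.
# The function name is illustrative.
nonexecutable_scripts() {
    awk '{
        name = $NF
        sub(/^\.\//, "", name)     # "./postinst" -> "postinst"
        if (name ~ /^(preinst|postinst|prerm|postrm)$/ && $1 !~ /x/)
            print name
    }'
}

# Typical use:
#   dpkg-deb --ctrl-tarfile package.deb | tar -tvf - | nonexecutable_scripts
```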

Tracing Shared Library Dependencies

Earlier we discussed how shared dynamic libraries reduce duplication across Linux systems. The flip side is dealing with complex dependency chains that emerge. Fortunately, Debian packages explicitly define shared library relationships that can get visualized with handy tools.

Inspect Declared Shared Library Dependencies

dpkg -s python3.8 

Sample truncated output:

Depends: libpython3.8 (= 3.8.2-0ubuntu2), libexpat1 (>= 2.2), ...  

This reveals the package's direct dependencies, including the shared library packages it needs. Library dependencies are usually generated at build time rather than written by hand. We can expand the chain recursively using apt-cache.

Trace Full Dependency Chain

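apt-cache depends --recurse python3.8

Sample truncated output:

python3.8
  Depends: libpython3.8
libpython3.8
  Depends: libc6
  Depends: libreadline8
libc6
  Depends: libgcc-s1
...

With --recurse, every package reached through the chain gets its own stanza, so indirect dependencies are listed as well. Without the flag, only the direct dependencies of python3.8 are shown.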
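To feed the dependency names into other commands, the relation labels need to be stripped. Here is a small sketch, assuming output in the format printed by apt-cache depends; the function name hard_depends is hypothetical.

```shell
# hard_depends: read `apt-cache depends` style output on stdin and print
# the unique package names from Depends and PreDepends lines.
# The function name is illustrative.
hard_depends() {
    awk '{
        kind = $1
        sub(/^\|/, "", kind)    # a leading "|" marks an alternative
        # Skip virtual packages, which are printed as <name>.
        if ((kind == "Depends:" || kind == "PreDepends:") && $2 !~ /^</)
            print $2
    }' | LC_ALL=C sort -u
}

# Typical use:
#   apt-cache depends python3.8 | hard_depends
```

Recommends and Suggests lines are ignored on purpose, since they are not required for the package to install.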

Visualize Complex Chains as a Directed Graph

apt-cache dotty python3.8

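This emits the dependency graph in the DOT language, which Graphviz can render, for example with apt-cache dotty python3.8 | dot -Tsvg -o python3.8-deps.svg. Packages that many others depend on stand out as nodes with numerous incoming edges.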

In summary, Debian packages declare their direct dependencies, including shared library packages, and tools can expand and visualize them recursively. This makes it easier to predict which packages an update will touch, and shows why systems stay more stable when they only pull in the libraries they actually need.

Identifying Package Sources Across Repositories

So far we have examined package contents assuming packages originate from the main Ubuntu repositories. However, production environments often use supplementary PPAs (Personal Package Archives) to obtain the latest versions of key software such as Python data science stacks.

Let's discuss strategies for pinpointing package sources spanning both official and third-party repositories:

Check the Configured APT Sources

cat /etc/apt/sources.list

This shows the repository sources configured in the main APT sources file. Output with only Ubuntu defaults would show:

deb http://archive.ubuntu.com/ubuntu/ focal main restricted
deb http://security.ubuntu.com/ubuntu/ focal-security restricted main
...

PPAs and other third-party archives are usually added as separate files under /etc/apt/sources.list.d/, so check that directory as well. Any entries beyond the Ubuntu defaults indicate that packages may have been installed from those external archives.
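Spotting the non-Ubuntu entries by eye gets error-prone once several source files exist. The following sketch assumes the classic one-line "deb ..." format shown above (not the newer deb822 .sources files) and treats any host under ubuntu.com as official; third_party_sources is a hypothetical helper name.

```shell
# third_party_sources: read one-line APT source entries on stdin and print
# the unique repository URLs that are not hosted under ubuntu.com.
# Assumes the classic "deb [options] URL suite components" format;
# deb822 .sources files are not handled. The name is illustrative.
third_party_sources() {
    awk '$1 == "deb" || $1 == "deb-src" {
        sub(/\[[^]]*\][ \t]*/, "")    # drop an optional [arch=... ] block
        if ($2 !~ /^https?:\/\/([a-z0-9-]+\.)*ubuntu\.com\//) print $2
    }' | LC_ALL=C sort -u
}

# Typical use:
#   cat /etc/apt/sources.list /etc/apt/sources.list.d/*.list 2>/dev/null |
#       third_party_sources
```

Commented-out entries are skipped automatically because their first field is "#" rather than "deb".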

Identify Source of an Installed Package

apt-cache policy postgresql

Sample truncated output:

Installed: 11.10-0ubuntu0.20.04.1
Candidate: 11.10-0ubuntu0.20.04.1
Version table:
*** 11.10-0ubuntu0.20.04.1 500
        500 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages        
        100 /var/lib/dpkg/status
     11.10-0ubuntu0.20.04 500
        500 http://security.ubuntu.com/ubuntu focal-security/main amd64
...             

This lists each available version together with the repositories that provide it. The installed version is marked with ***, and the numbers are APT pin priorities.

In this output we can see that the installed PostgreSQL 11 package comes from the focal-updates repository of Ubuntu 20.04 LTS.
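For audits across many packages, the origin of the installed version can be extracted automatically. This is a rough sketch, assuming the version-table layout shown in the sample above; installed_origins is a hypothetical helper name.

```shell
# installed_origins: read `apt-cache policy <package>` output on stdin and
# print the repository URL and suite that provide the installed version
# (the one marked with "***"). The function name is illustrative.
installed_origins() {
    awk '
        /^ *\*\*\* / { active = 1; next }        # installed version starts
        # A "<version> <priority>" line means the next version has begun.
        active && NF == 2 && $2 ~ /^[0-9]+$/ { active = 0 }
        active && $2 ~ /^https?:/ { print $2, $3 }
    '
}

# Typical use:
#   apt-cache policy postgresql | installed_origins
```

If nothing is printed, the installed version is not offered by any configured http(s) repository, which usually points to a locally installed .deb.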

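Identify All Non-Ubuntu Packages

apt list --installed

This prints every installed package together with the suite it is available from, its version, architecture, and status flags:

libboost-all-dev/focal-updates,now 1.71.0.0ubuntu2 amd64 [installed]
libssl-dev/focal-updates 1.1.1f-1ubuntu2.12 amd64 [installed] 
python3.8/focal-updates,now 3.8.5-1~20.04.2 amd64 [installed,automatic]

Here all three packages are available from focal-updates. Python 3.8 is flagged automatic, meaning it was pulled in as a dependency, while the other libraries were installed explicitly by an administrator. A package that cannot be found in any configured repository, such as a manually installed .deb, is flagged [installed,local] instead. To list only the names of explicitly installed packages, use apt-mark showmanual.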
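Packages that no configured repository provides are often the most interesting ones in an audit. The sketch below assumes lines in the format printed by apt list --installed, where such packages carry a "local" status flag; local_packages is a hypothetical helper name.

```shell
# local_packages: read `apt list --installed` style lines on stdin and
# print the names of packages whose status flags include "local", i.e.
# packages not available from any configured repository.
# The function name is illustrative.
local_packages() {
    awk '/\[[^]]*local[^]]*\]/ { split($1, parts, "/"); print parts[1] }'
}

# Typical use (apt warns on stderr that its CLI output is not stable):
#   apt list --installed 2>/dev/null | local_packages
```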

In summary, combining multiple tools exposes the full spectrum of package sources spanning both official and third-party archives. This audit trail helps troubleshoot unexpected changes and identify unnecessary supplementary channels.

Retracing Package Installation History

Lastly, a key technique for diagnosing package issues is analyzing histories to pinpoint updates that triggered breaking changes. Debian logs details like dpkg transactions, command output, and changelogs to help retrace events. Common historical signals revealing disruptive packages:

Cross-reference /var/log/dpkg.log during outages

Look for unusual package removals or failures around degradation timestamps.
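Narrowing the log to a time window around the incident can be sketched as follows, assuming the usual dpkg.log line layout of date, time, action, package and versions. The function name package_changes and the timestamps in the usage comment are illustrative.

```shell
# package_changes FROM TO: read dpkg.log lines on stdin and print the
# install, upgrade, remove and purge actions whose timestamp lies between
# FROM and TO (both "YYYY-MM-DD HH:MM:SS"). The name is illustrative.
package_changes() {
    awk -v from="$1" -v to="$2" '
        $3 == "install" || $3 == "upgrade" || $3 == "remove" || $3 == "purge" {
            # ISO-style timestamps sort correctly as plain strings.
            stamp = $1 " " $2
            if (stamp >= from && stamp <= to) print
        }'
}

# Typical use:
#   package_changes "2021-08-22 12:00:00" "2021-08-22 13:00:00" < /var/log/dpkg.log
```

The intermediate "status" and "configure" lines are left out so that only the actions that change which version is installed remain.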

Audit /var/log/apt/history.log

Scan for irregular packages from new sources introduced before incidents.
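A compact timeline of who ran what is often enough to start with. Here is a minimal sketch, assuming the stanza layout of history.log with Start-Date and Commandline fields; history_commands is a hypothetical helper name.

```shell
# history_commands: read APT history.log stanzas on stdin and print one
# "<start date and time><TAB><command line>" row per recorded run.
# The function name is illustrative.
history_commands() {
    awk '
        /^Start-Date:/  { start = $2 " " $3 }
        /^Commandline:/ { sub(/^Commandline: /, ""); print start "\t" $0 }
    '
}

# Typical use:
#   history_commands < /var/log/apt/history.log
```

Runs recorded without a Commandline field produce no row, so the raw log remains the authoritative record.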

Examine /var/log/apt/term.log

Review command output from package manager runs over time.

Check changelogs for updated packages

Verify if patched vulnerabilities or major version upgrades occurred.

While simple in principle, manually piecing together a timeline becomes tedious at scale. This is where automation and log analytics systems help identify the most relevant subsets of historical signals. The takeaway is that Debian's logging provides a time machine to rewind installation events when trouble arises.

Conclusion

In closing, detailed visibility into Ubuntu package contents, sizes, dependencies, installations and histories is vital for effectively administering large-scale Linux deployments. We covered a diverse toolbox enabling both high-level package summaries as well as low-level file inspection. The packaging ecosystem provides rich interfaces and metadata to power use cases from porting legacy software to repackaging proprietary apps. Hopefully this guide has demystified the inner workings of .deb packages while revealing best practices for querying package details using the flexible dpkg toolkit.
