How can I disable archive indexing and delete its contents from the tables?

Saleh Samara
2023-02-26 20:14

The indexed_archives_entries database table is an index of files contained within archive files, whose content is searchable by conducting an Archive Search (formerly Class Search). When a new archive file is deployed to Artifactory, its contents are indexed and the table is updated. These archive files can be .jar, .war, .zip, or other archive types, as defined in the file:

${ARTIFACTORY_HOME}/etc/mimetypes.xml

Entries are deleted from this table whenever an indexed artifact is deleted by Artifactory's garbage collection. Therefore, the table does not hold index entries for artifacts that no longer exist.
We've noticed that in many cases the archive indexing tables can occupy about 40% of DB storage and increase DB CPU usage during related operations, such as deleting an artifact. We therefore recommend disabling this feature if it is not in use.
To verify if this feature is being used in your organization, you can examine the Artifactory request log and search for requests to the following paths:

/api/search/archive
/ui/api/v1/ui/artifactsearch/class

Requests to these paths indicate that the feature is in use and show where the requests originate.
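As a rough illustration of the check described above, the following Python sketch scans request log lines for the two paths. It is only a sketch: the function names are hypothetical, the log file location is something you must supply for your own installation, and the code makes no assumption about the log's column layout beyond the path appearing somewhere in the line.

```python
from typing import Iterable, List, Tuple

# The two request paths named in this article.
ARCHIVE_SEARCH_PATHS = (
    "/api/search/archive",
    "/ui/api/v1/ui/artifactsearch/class",
)


def find_archive_search_requests(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (path, log line) pairs for lines that mention an Archive Search path."""
    matches = []
    for line in lines:
        for path in ARCHIVE_SEARCH_PATHS:
            if path in line:
                # Keep the whole line so the request origin stays visible.
                matches.append((path, line.rstrip("\n")))
                break
    return matches


def scan_request_log(log_path: str) -> List[Tuple[str, str]]:
    """Scan one request log file; log_path is supplied by the caller."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return find_archive_search_requests(log)
```

Calling scan_request_log() with the path to your request log returns the matching lines. An empty result for a representative period suggests, but does not prove, that Archive Search is not in use.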

To disable archive indexing for a specific file type, follow the instructions for your Artifactory version below.
Disabling the Archive Indexing

Artifactory 7.x

Use the instructions in this article to disable the feature from the Admin Settings UI.

Artifactory 6.x

Edit the mimetypes.xml file and change the value of the index attribute from true to false for each file type you no longer want indexed. Disabling future indexing of a specific mimetype does not delete its existing index entries from the DB.
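The edit can also be scripted. The Python sketch below flips the index attribute for selected types. It assumes each entry in mimetypes.xml is a mimetype element with type and index attributes. Check your own file before relying on that layout. The function name and the sample type are illustrative only.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def disable_indexing(xml_text: str, mime_types: Iterable[str]) -> str:
    """Return xml_text with index="false" for the given mimetype entries.

    Assumes entries look like <mimetype type="..." index="true" .../>.
    """
    targets = set(mime_types)
    root = ET.fromstring(xml_text)
    for entry in root.iter("mimetype"):
        if entry.get("type") in targets and entry.get("index") == "true":
            entry.set("index", "false")
    # Note: ElementTree drops XML comments and the XML declaration.
    return ET.tostring(root, encoding="unicode")
```

For example, disable_indexing(text, ["application/zip"]) would stop future indexing of that type only. Because this approach discards comments in the file, keep a backup of the original mimetypes.xml and review the result before replacing it.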

Cleaning the unused tables after disabling the Archive Indexing feature (both versions)

The following steps delete the existing indexes from your DB so that only the necessary mimetypes are reindexed afterwards. Note: These instructions delete data from your database. To avoid corrupting it, execute each step carefully and in the order given:
1. Disable the Archive Indexing feature using the steps in the section above
2. Shut down Artifactory
3. Truncate the archive-related tables using the following SQL statements:

TRUNCATE indexed_archives_entries;
TRUNCATE indexed_archives CASCADE;
TRUNCATE archive_names CASCADE;
TRUNCATE archive_paths CASCADE;

4. Start Artifactory

Optional: Restoring Archive Indexing Data
If you'd like to restore archive indexing data after modifying mimetypes.xml and cleaning the above tables, you can send the following REST request to recalculate your Archives Index:

curl -X POST -u admin:password "https://{server_name}:{port_number}/artifactory/api/archiveIndex/*"

You can specify a single repository key or use an asterisk (*) to trigger calculations for all local and cache repositories.

Note: This request requires an admin user. Furthermore, this process can be resource intensive. Accordingly, we recommend initiating it when your servers are not heavily loaded. We also recommend that you reindex one repository at a time, to limit the stress you place on your system.
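One way to follow the one-repository-at-a-time advice is to loop over repository keys and send the same request for each. The Python sketch below does this with the standard library only. The function names, the pause between requests, and any repository keys you pass in are assumptions for illustration. The sketch cannot tell when indexing has finished, so the pause is only a crude spacer.

```python
import base64
import time
import urllib.request
from typing import Iterable
from urllib.parse import quote


def build_reindex_request(base_url: str, repo_key: str,
                          user: str, password: str) -> urllib.request.Request:
    """Build the POST request for one repository.

    base_url is expected to end in /artifactory, as in the curl example.
    """
    url = f"{base_url.rstrip('/')}/api/archiveIndex/{quote(repo_key, safe='')}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url, data=b"", method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


def reindex_one_at_a_time(base_url: str, repo_keys: Iterable[str],
                          user: str, password: str,
                          pause_seconds: float = 60.0) -> None:
    """Send one reindex request per repository, pausing between them."""
    for repo_key in repo_keys:
        request = build_reindex_request(base_url, repo_key, user, password)
        with urllib.request.urlopen(request) as response:
            print(repo_key, response.status)
        time.sleep(pause_seconds)  # arbitrary spacing, not a completion check
```

Watching database and server load between repositories, rather than relying on a fixed pause, is likely the safer approach on a busy system.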