You are given a list of directory information entries, where each entry contains a directory path followed by one or more files with their respective contents. Your task is to identify and group all files that have identical content across the entire file system.
Each directory entry in the input follows this structured format:
"directory/path filename1.ext(content1) filename2.ext(content2) ... filenameN.ext(contentN)"This notation indicates that within the directory "directory/path", there exist N files. Each file is represented as filename.ext(content), where the text inside the parentheses denotes the actual content of that file.
Your objective is to analyze all files across all directories and return groups of file paths where each group contains files that share exactly the same content. A valid group must contain at least two files with matching content. The output should list the complete file paths in the format "directory/path/filename.ext".
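One common way to approach this objective is to map each distinct content string to the list of full paths that carry it, then keep only the lists with two or more paths. The sketch below illustrates that idea; the function name find_duplicates is hypothetical, and the code assumes that tokens are separated by single spaces and that contents contain no parentheses.

```python
from collections import defaultdict


def find_duplicates(paths: list[str]) -> list[list[str]]:
    """Group full file paths by identical content (sketch)."""
    by_content: dict[str, list[str]] = defaultdict(list)
    for entry in paths:
        # First token is the directory; the rest are "name(content)" tokens.
        directory, *files = entry.split(" ")
        for token in files:
            name, _, rest = token.partition("(")
            content = rest[:-1]  # strip the closing ")"
            by_content[content].append(f"{directory}/{name}")
    # A valid group needs at least two files with the same content.
    return [group for group in by_content.values() if len(group) >= 2]


groups = find_duplicates(["root/dir1 file1.txt(content)", "root/dir2 file2.txt(content)"])
print(groups)  # [['root/dir1/file1.txt', 'root/dir2/file2.txt']]
```

Each character of the input is visited a constant number of times, so the running time is linear in the total input length. The order of groups and of paths within a group follows input order here, so callers that compare against a fixed expected answer may want to sort both levels first.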
Important Notes:
"root" or "home")paths = ["root/a 1.txt(abcd) 2.txt(efgh)","root/c 3.txt(abcd)","root/c/d 4.txt(efgh)","root 4.txt(efgh)"][["root/a/1.txt","root/c/3.txt"],["root/4.txt","root/a/2.txt","root/c/d/4.txt"]]Files with content "abcd": root/a/1.txt and root/c/3.txt form one duplicate group. Files with content "efgh": root/a/2.txt, root/c/d/4.txt, and root/4.txt form another duplicate group. These are all the files that share identical content with at least one other file.
Input: paths = ["root/a 1.txt(abcd) 2.txt(efgh)","root/c 3.txt(abcd)","root/c/d 4.txt(efgh)"]
Output: [["root/a/1.txt","root/c/3.txt"],["root/a/2.txt","root/c/d/4.txt"]]
Explanation: Two duplicate groups are found: one containing the files with content "abcd" (root/a/1.txt and root/c/3.txt), and another containing the files with content "efgh" (root/a/2.txt and root/c/d/4.txt).
Input: paths = ["root/dir1 file1.txt(content)","root/dir2 file2.txt(content)"]
Output: [["root/dir1/file1.txt","root/dir2/file2.txt"]]
Explanation: Both files contain the identical content "content", so they form a single duplicate group even though they have different filenames and are located in different directories.
Constraints