Canonical Voices

Posts tagged with 'filesystem'

mandel

Yet again Windows has presented me a challenge when trying to work with its file system, this time in the form of lock files. The Ubuntu One client on linux uses pyinotify to be able to listen to the file system events this, for example, allows the daemon to be updating your files when a new version has been created without the direct intervention of the user.

Although Windows does not have pyinotify (for obvious reasons) a developer that wants to perform such a directory monitoring can rely on the ReadDirectoryChangesW function. This function provides a similar behavior but unfortunately the information it provides is limited when compared with the one from pyinotify. On one hand, there are less events you can listen on Windows (IN_OPEN and IN_CLOSE for example are not present) but it also provides very little information by just giving 5 actions back, that is while on Windows you can listen to:

  • FILE_NOTIFY_CHANGE_FILE_NAME
  • FILE_NOTIFY_CHANGE_DIR_NAME
  • FILE_NOTIFY_CHANGE_ATTRIBUTES
  • FILE_NOTIFY_CHANGE_SIZE
  • FILE_NOTIFY_CHANGE_LAST_WRITE
  • FILE_NOTIFY_CHANGE_LAST_ACCESS
  • FILE_NOTIFY_CHANGE_CREATION
  • FILE_NOTIFY_CHANGE_SECURITY

You will only get back 5 values which are integers that represent the action that was performed. YesterdayI decide to see if it was possible to query the Windows Object Manager to see the currently used FILE HANDLES which would returned the open files. My idea was to write such a function and the pool (ouch!) to find when a file was opened or close. The result of such an attempt is the following:

import os
import struct
 
import winerror
import win32file
import win32con
 
from ctypes import *
from ctypes.wintypes import *
from Queue import Queue
from threading import Thread
from win32api import GetCurrentProcess, OpenProcess, DuplicateHandle
from win32api import error as ApiError
from win32con import (
    FILE_SHARE_READ,
    FILE_SHARE_WRITE,
    OPEN_EXISTING,
    FILE_FLAG_BACKUP_SEMANTICS,
    FILE_NOTIFY_CHANGE_FILE_NAME,
    FILE_NOTIFY_CHANGE_DIR_NAME,
    FILE_NOTIFY_CHANGE_ATTRIBUTES,
    FILE_NOTIFY_CHANGE_SIZE,
    FILE_NOTIFY_CHANGE_LAST_WRITE,
    FILE_NOTIFY_CHANGE_SECURITY,
    DUPLICATE_SAME_ACCESS
)
from win32event import WaitForSingleObject, WAIT_TIMEOUT, WAIT_ABANDONED
from win32event import error as EventError
from win32file import CreateFile, ReadDirectoryChangesW, CloseHandle
from win32file import error as FileError
 
# from ubuntuone.platform.windows.os_helper import LONG_PATH_PREFIX, abspath
 
LONG_PATH_PREFIX = '\\\\?\\'
# constant found in the msdn documentation:
# http://msdn.microsoft.com/en-us/library/ff538834(v=vs.85).aspx
FILE_LIST_DIRECTORY = 0x0001
FILE_NOTIFY_CHANGE_LAST_ACCESS = 0x00000020
FILE_NOTIFY_CHANGE_CREATION = 0x00000040
 
# XXX: the following code is some kind of hack that allows to get the opened
# files in a system. The techinique uses an no documented API from windows nt
# that is internal to MS and might change in the future braking our code :(
UCHAR = c_ubyte
PVOID = c_void_p
 
ntdll = windll.ntdll
 
SystemHandleInformation = 16
STATUS_INFO_LENGTH_MISMATCH = 0xC0000004
STATUS_BUFFER_OVERFLOW = 0x80000005L
STATUS_INVALID_HANDLE = 0xC0000008L
STATUS_BUFFER_TOO_SMALL = 0xC0000023L
STATUS_SUCCESS = 0
 
CURRENT_PROCESS = GetCurrentProcess ()
DEVICE_DRIVES = {}
for d in "abcdefghijklmnopqrstuvwxyz":
    try:
        DEVICE_DRIVES[win32file.QueryDosDevice (d + ":").strip ("\x00").lower ()] = d + ":"
    except FileError, (errno, errctx, errmsg):
        if errno == 2:
            pass
        else:
          raise
 
class x_file_handles(Exception):
    pass
 
def signed_to_unsigned(signed):
    unsigned, = struct.unpack ("L", struct.pack ("l", signed))
    return unsigned
 
class SYSTEM_HANDLE_TABLE_ENTRY_INFO(Structure):
    """Represent the SYSTEM_HANDLE_TABLE_ENTRY_INFO on ntdll."""
    _fields_ = [
        ("UniqueProcessId", USHORT),
        ("CreatorBackTraceIndex", USHORT),
        ("ObjectTypeIndex", UCHAR),
        ("HandleAttributes", UCHAR),
        ("HandleValue", USHORT),
        ("Object", PVOID),
        ("GrantedAccess", ULONG),
    ]
 
class SYSTEM_HANDLE_INFORMATION(Structure):
    """Represent the SYSTEM_HANDLE_INFORMATION on ntdll."""
    _fields_ = [
        ("NumberOfHandles", ULONG),
        ("Handles", SYSTEM_HANDLE_TABLE_ENTRY_INFO * 1),
    ]
 
class LSA_UNICODE_STRING(Structure):
    """Represent the LSA_UNICODE_STRING on ntdll."""
    _fields_ = [
        ("Length", USHORT),
        ("MaximumLength", USHORT),
        ("Buffer", LPWSTR),
    ]
 
class PUBLIC_OBJECT_TYPE_INFORMATION(Structure):
    """Represent the PUBLIC_OBJECT_TYPE_INFORMATION on ntdll."""
    _fields_ = [
        ("Name", LSA_UNICODE_STRING),
        ("Reserved", ULONG * 22),
    ]
 
class OBJECT_NAME_INFORMATION (Structure):
    """Represent the OBJECT_NAME_INFORMATION on ntdll."""
    _fields_ = [
        ("Name", LSA_UNICODE_STRING),
    ]
 
class IO_STATUS_BLOCK_UNION (Union):
    """Represent the IO_STATUS_BLOCK_UNION on ntdll."""
    _fields_ = [
        ("Status", LONG),
        ("Pointer", PVOID),
    ]
 
class IO_STATUS_BLOCK (Structure):
    """Represent the IO_STATUS_BLOCK on ntdll."""
    _anonymous_ = ("u",)
    _fields_ = [
        ("u", IO_STATUS_BLOCK_UNION),
        ("Information", POINTER (ULONG)),
    ]
 
class FILE_NAME_INFORMATION (Structure):
    """Represent the on FILE_NAME_INFORMATION ntdll."""
    filename_size = 4096
    _fields_ = [
        ("FilenameLength", ULONG),
        ("FileName", WCHAR * filename_size),
    ]
 
def get_handles():
    """Return all the processes handles in the system atm."""
    system_handle_information = SYSTEM_HANDLE_INFORMATION()
    size = DWORD (sizeof (system_handle_information))
    while True:
        result = ntdll.NtQuerySystemInformation(
            SystemHandleInformation,
            byref(system_handle_information),
            size,
            byref(size)
        )
        result = signed_to_unsigned(result)
        if result == STATUS_SUCCESS:
            break
        elif result == STATUS_INFO_LENGTH_MISMATCH:
            size = DWORD(size.value * 4)
            resize(system_handle_information, size.value)
        else:
            raise x_file_handles("NtQuerySystemInformation", hex(result))
 
    pHandles = cast(
        system_handle_information.Handles,
        POINTER(SYSTEM_HANDLE_TABLE_ENTRY_INFO * \
                system_handle_information.NumberOfHandles)
    )
    for handle in pHandles.contents:
        yield handle.UniqueProcessId, handle.HandleValue
 
def get_process_handle (pid, handle):
    """Get a handle for the process with the given pid."""
    try:
        hProcess = OpenProcess(win32con.PROCESS_DUP_HANDLE, 0, pid)
        return DuplicateHandle(hProcess, handle, CURRENT_PROCESS,
            0, 0, DUPLICATE_SAME_ACCESS)
    except ApiError,(errno, errctx, errmsg):
        if errno in (
              winerror.ERROR_ACCESS_DENIED,
              winerror.ERROR_INVALID_PARAMETER,
              winerror.ERROR_INVALID_HANDLE,
              winerror.ERROR_NOT_SUPPORTED
        ):
            return None
        else:
            raise
 
 
def get_type_info (handle):
    """Get the handle type information."""
    public_object_type_information = PUBLIC_OBJECT_TYPE_INFORMATION()
    size = DWORD(sizeof(public_object_type_information))
    while True:
        result = signed_to_unsigned(
            ntdll.NtQueryObject(
                handle, 2, byref(public_object_type_information), size, None))
        if result == STATUS_SUCCESS:
            return public_object_type_information.Name.Buffer
        elif result == STATUS_INFO_LENGTH_MISMATCH:
            size = DWORD(size.value * 4)
            resize(public_object_type_information, size.value)
        elif result == STATUS_INVALID_HANDLE:
            return None
        else:
            raise x_file_handles("NtQueryObject.2", hex (result))
 
 
def get_name_info (handle):
    """Get the handle name information."""
    object_name_information = OBJECT_NAME_INFORMATION()
    size = DWORD(sizeof(object_name_information))
    while True:
        result = signed_to_unsigned(
            ntdll.NtQueryObject(handle, 1, byref (object_name_information),
            size, None))
        if result == STATUS_SUCCESS:
            return object_name_information.Name.Buffer
        elif result in (STATUS_BUFFER_OVERFLOW, STATUS_BUFFER_TOO_SMALL,
                        STATUS_INFO_LENGTH_MISMATCH):
            size = DWORD(size.value * 4)
            resize (object_name_information, size.value)
        else:
            return None
 
 
def filepath_from_devicepath (devicepath):
    """Return a file path from a device path."""
    if devicepath is None:
        return None
    devicepath = devicepath.lower()
    for device, drive in DEVICE_DRIVES.items():
        if devicepath.startswith(device):
            return drive + devicepath[len(device):]
    else:
        return devicepath
 
def get_real_path(path):
    """Return the real path avoiding issues with the Library a in Windows 7"""
    assert os.path.isdir(path)
    handle = CreateFile(
        path,
        FILE_LIST_DIRECTORY,
        FILE_SHARE_READ | FILE_SHARE_WRITE,
        None,
        OPEN_EXISTING,
        FILE_FLAG_BACKUP_SEMANTICS,
        None
    )
    name = get_name_info(int(handle))
    CloseHandle(handle)
    return filepath_from_devicepath(name)
 
def get_open_file_handles():
    """Return all the open file handles."""
    print 'get_open_file_handles'
    result = set()
    this_pid = os.getpid()
    for pid, handle in get_handles():
        if pid == this_pid:
            continue
        duplicate = get_process_handle(pid, handle)
        if duplicate is None:
            continue
        else:
            # get the type info and name info of the handle
            type = get_type_info(handle)
            name = get_name_info(handle)
            # add the handle to the result only if it is a file
            if type and type == 'File':
                # the name info represents the path to the object,
                # we need to convert it to a file path and then
                # test that it does exist
                if name:
                    file_path = filepath_from_devicepath(name)
                    if os.path.exists(file_path):
                        result.add(file_path)
    return result
 
def get_open_file_handles_under_directory(directory):
    """get the open files under a directory."""
    result = set()
    all_handles = get_open_file_handles()
    # to avoid issues with Libraries on Windows 7 and later, we will
    # have to get the real path
    directory = get_real_path(os.path.abspath(directory))
    print 'Dir ' + directory
    if not directory.endswith(os.path.sep):
        directory += os.path.sep
    for file in all_handles:
        print 'Current file ' + file
        if directory in file:
            result.add(file)
    return result

The above code uses undocumented functions from the ntdll which I supposed Microsoft does not want me to use. An while it works, the solution does no scale since the process of querying the Object Manager is vey expensive and can rocket your CPU if performed several times. Nevertheless the above code works correctly and could be used to write a tools similar to those written by sysinternals.

I hope someone will find a use for the code, in my case it is code that I’ll have to throw away :(

Read more
mandel

As most of you know, the Windows files system does not support a number of special characters. To be precise does characters are

:”/\\|?*. As you can imaging this is a problem when syncing between far superior Unix system and Windows. Knowing this, can you please let me know what is wrong/right in this image:

Got it? Lets look closer:

Well, the genius behind this was not me but it was Chipaca, I can tell you, I’m far less imaginative. But this little trick will allow you to sync between Windows and Ubuntu in a far better user friendly way that other sync services do :)

Read more