Example: image file viewer program

Let us consider a classic example: a program used for opening and viewing raster image files. The image can be zoomed and scrolled (if it is bigger than the screen).

In classic operating systems, this task is usually performed by a closed-architecture application. Such applications either contain all the required code or use third-party components. The key point is that it is impossible to use only a part of such an application: you either run it as-is or choose another solution.

To understand how this task is solved in the Sivelkiria OS, let us take a detailed look at this program and the concepts it operates on. Below is a list of the entities involved in the program’s operation.

  1. Image location. In the classic case, it is defined by a file path. From a wider perspective, this category also includes local network or internet addresses. There are more options still: the image can be located in RAM, in another program’s output (e.g. image manipulation or graphic design software), on a web page, inside a chat message or e-mail, inside an archive or an office document. Although these options differ technically and are processed in different ways, from the user’s point of view they are all just ways of locating the image. Creating an interface capable of choosing between all these options is not an easy task; however, it is conceptually possible. Moreover, there is no reason to limit the user’s ability to view an image stored at any of the locations listed above.
  2. A sequence of bytes representing the stored image. Again, the way to access it will be determined by the storage method, but from the point of view of algorithms that read these bytes in order to display the image on the screen, the difference is unlikely to be significant.
  3. Image file format. Popular formats include jpeg and gif. The actual format is determined by the file’s content: if a jpeg file is mistakenly saved with a .gif extension, interpreting it as gif or refusing to process it would be equally shortsighted. At the same time, clues about the intended format can be discovered by analyzing the file extension, the web server’s response headers or other available data.
  4. The codec containing the algorithms necessary to extract the bitmap and metadata from the raw byte format.
  5. Representation of an image file’s contents as one or multiple pages, and representation of a page as one or multiple frames. Some formats, such as tiff, allow a file to contain multiple pages. Others, e.g. gif, support frame-by-frame animation.
  6. Full image size.
  7. Image color parameters: color model, color depth, palettes, transparency support.
  8. A bitmap representing the whole image or its part.
  9. The geometry (size and offset) of the image area to be displayed on the screen.
  10. A window (or GUI of some other kind) which displays the image on the screen.
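As a rough illustration, the first few entities on this list can be sketched as small data interfaces. The sketch below uses Python protocols; every name in it (`ObjectLocation`, `FileLocation`, and so on) is invented for illustration and is not part of any real Sivelkiria API.

```python
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class ObjectLocation(Protocol):
    """Entity 1: where the object lives (file path, URL, archive entry...)."""

    def open_bytes(self) -> bytes: ...            # entity 2: the raw bytes
    def format_hints(self) -> Sequence[str]: ...  # entity 3: clues (extension, headers)


class FileLocation:
    """A file-path-backed location; in-memory data stands in for disk access."""

    def __init__(self, path: str, data: bytes):
        self.path = path
        self._data = data

    def open_bytes(self) -> bytes:
        return self._data

    def format_hints(self) -> Sequence[str]:
        # The extension is only a hint; the real format comes from the content.
        return [self.path.rsplit(".", 1)[-1].lower()]
```

A network- or archive-backed location would implement the same protocol, which is what lets the rest of the pipeline ignore where the bytes came from.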

In the Sivelkiria OS, every entity mentioned above is described by a specific data interface (an API). Such a representation allows these entities to be used in many contexts outside of the initial model (this will be elaborated on further in this document).

Each interface is implemented by an object that is created by a module. The object’s method code is executed in the context of the module that generated it. However, it would be incorrect to talk about running modules as separate processes, since they have neither associated threads nor data outside of the objects created. If some state is to be stored and used when processing more than one image, then it is encapsulated into a dedicated object which is then accessed using the common rules via the OS’s object API.
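The rule that cross-image state lives in a dedicated object rather than in module globals can be shown in miniature. The `ThumbnailCache` name and shape below are assumptions made for illustration:

```python
class ThumbnailCache:
    """Holds state shared across several images. A module keeps no
    globals: anything that must outlive a single call is placed in an
    object like this one and accessed through the OS's object API."""

    def __init__(self):
        self._cache = {}

    def get(self, location_key):
        """Return a cached thumbnail, or None if it was never stored."""
        return self._cache.get(location_key)

    def put(self, location_key, thumbnail):
        """Store a thumbnail under the key identifying its location."""
        self._cache[location_key] = thumbnail
```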

The structure of modules involved in this program’s operation may look like this:

  1. Module one defines the behavior of the object that implements the ‘Object Location’ interface (the image being a special case of an object). In particular, it defines the way to access bytes in the given location. When doing so, it can use the modules that ensure disk access or network support, depending on the physical location of the object.
  2. Module two provides direct access to the image’s bytes. It also gives some clues about the possible content type based on the storage method (file extension, web response headers, etc.).
  3. Module three uses these clues and accesses the image’s byte representation to deduce the actual contents type (file format).
  4. Module four implements the codec, which allows extraction of service information (size, color parameters, page and frame layout) and graphic information (bitmaps) from the raw byte representation. Both full and partial information can be extracted: for example, the calling context may require a specific frame or page. Another example of extracting partial information is previewing a thumbnail in low resolution: for the jpeg format, full file reading may not be required.
  5. Module five receives the geometry of the full image, its bitmap, as well as the information about the fragment to display and the scale of this fragment. Based on this data, it builds the bitmap to be displayed on the screen and passes it back to the calling context.
  6. Finally, the top-level module implements the GUI. It is responsible for rendering the window, displaying the resulting bitmap and reacting to user actions. Some of these actions may require accessing other modules — for example, if the user initiates opening or saving a file, it will run the module responsible for viewing the location of files on the disk, which in turn will return an interface describing the location of the file selected by the user.
  7. Of course, the program can use more modules to implement some additional behaviors, such as logging, rendering window components, sound feedback, etc.
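The data flow between the modules listed above can be sketched as a single wiring function. All names and signatures here are invented stand-ins, not a real API:

```python
def view_image(location, detect_format, codecs, scale, gui):
    """Wires modules 1-6 together: locate -> read -> detect -> decode
    -> scale -> display. Each argument stands for a separate module."""
    data = location.open_bytes()                        # modules 1-2
    fmt = detect_format(data, location.format_hints())  # module 3
    bitmap = codecs[fmt].decode(data)                   # module 4
    framed = scale(bitmap, viewport=(0, 0, 800, 600), zoom=1.0)  # module 5
    gui.show(framed)                                    # module 6
    return framed
```

The point of the sketch is that `view_image` knows nothing about where the bytes live, which format they are in, or how scaling is implemented; replacing any one module leaves the rest untouched.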

It is easy to see that this structure makes reuse of functionality as simple as possible. For instance, once a new codec is installed in the system, its image type is automatically supported by all programs in all contexts. Different modules can be written by different developers in different languages, but these differences are insignificant in this model of module interaction. The same codec can be used for loading images onto the screen in the described program’s interface, or for rendering them in a messaging program, web browser, directory browser, and so on.

Support for new use cases can be added easily. For example, adding new interfaces makes it possible to support further actions, such as switching to the next/previous image (regardless of location and access method) or applying filters. Introducing thin client support would not cause any problems either: the operating system controls the data and calls passing between modules, so resource-demanding operations such as image decoding or scaling can easily be moved to a different host.
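For instance, next/previous navigation could be introduced as one more interface that a location provider may implement. `DirectorySequence` below is a made-up example that steps through a plain list of names; an archive- or chat-backed provider would expose the same two methods:

```python
from typing import Optional


class DirectorySequence:
    """Hypothetical add-on interface: step through neighbouring images
    regardless of how each image is located (directory, archive, chat...)."""

    def __init__(self, names, index=0):
        self._names = list(names)
        self._i = index

    def next(self) -> Optional[str]:
        """Advance to the next image, or return None at the end."""
        if self._i + 1 < len(self._names):
            self._i += 1
            return self._names[self._i]
        return None

    def previous(self) -> Optional[str]:
        """Step back to the previous image, or return None at the start."""
        if self._i > 0:
            self._i -= 1
            return self._names[self._i]
        return None
```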

Since the prototypes of all the modules described above are known to the operating system, it is aware of their needs from a system point of view and can act accordingly. For example, when working with large images, the scaling operation can be performed by a dedicated thread to avoid blocking the interface on low-powered computers. Moreover, since modules do not know anything about the threads that execute them, additional optimizations are possible. For example, the GUI thread can be separated from the thread responsible for calculations (such as image scaling) if the latter fails to complete its task within a predetermined time interval. Another optimization could involve starting a new calculation thread even before the old one handles the signal to abort an operation that is no longer required (e.g. if the scale was changed again before the image was rendered). The fact that the operating system has information about all threads, the workload and the CPUs’ capabilities leaves room for these optimizations to be more effective than those made by an application developer based on assumptions about the environment and conditions.
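The abort-and-restart optimization described above amounts to cooperative cancellation. A minimal sketch, assuming threads and an event flag as the signalling mechanism (the `ScalingScheduler` name is invented):

```python
import threading


class ScalingScheduler:
    """When a new scaling request arrives, the previous one is signalled
    to abort and the new one starts immediately, without waiting for the
    stale thread to finish (cooperative cancellation)."""

    def __init__(self):
        self._cancel = None

    def submit(self, work):
        """Start `work(cancel_event)` on a fresh thread, flagging the
        previous task (if any) to stop as soon as it checks the flag."""
        if self._cancel is not None:
            self._cancel.set()              # tell the stale task to stop
        cancel = threading.Event()
        self._cancel = cancel
        thread = threading.Thread(target=work, args=(cancel,))
        thread.start()
        return thread
```

Each worker is expected to poll (or wait on) its cancel event and bail out early; the scheduler never blocks on the old thread.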

There are several ways to decide how modules that jointly solve an application problem are selected to work together. For example, the choice of the module that reads bytes from the disk is obviously determined by the file system of the volume, and the same module will handle all requests to that volume. The module responsible for detecting the image format will most likely be installed at the system level and used in all contexts. An exception can be made when a module fails: in this case, the operating system can search for another module implementing the same prototype and, if one exists, try it with the same input data. Thus, the issue can be resolved without user intervention. The data regarding the module’s failure could be collected automatically and, if allowed by the security policies, sent to the developers of the faulty module along with the required debug information. If problems occur too often in a module, the operating system can exclude it from the lookup chain or lower its priority in the queue.
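The fallback lookup could behave roughly as follows; this is a sketch under assumed names (`ModuleChain`, a failure threshold of three), not a description of real OS behavior:

```python
class ModuleChain:
    """Modules implementing the same prototype are tried in priority
    order; a module that fails too often is excluded from lookups."""

    def __init__(self, modules, max_failures=3):
        self._modules = list(modules)        # in priority order
        self._failures = {m: 0 for m in self._modules}
        self._max = max_failures

    def run(self, *args):
        last_error = None
        for module in list(self._modules):
            try:
                return module(*args)
            except Exception as exc:
                # A real system would also collect debug data here and,
                # policy permitting, report it to the module's developers.
                last_error = exc
                self._failures[module] += 1
                if self._failures[module] >= self._max:
                    self._modules.remove(module)
        raise last_error
```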

In other cases, module selection can be done by the user and stored as part of system configuration. For example, different scaling modules can provide different rendering styles (anti-aliasing parameters when zooming out or blurring pixel borders when zooming in). Depending on the context, the user may need a different approach (sharp pixel borders for precise positioning, or blurry ones for visual comfort).
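Such a stored preference could be as simple as a per-context mapping; all names below are invented for illustration:

```python
# Hypothetical stored configuration: which scaling module the user
# picked for each context.
SCALER_PREFERENCES = {
    "pixel_editor": "nearest_neighbour",   # sharp pixel borders
    "photo_viewer": "bilinear",            # smooth, blurred borders
}


def pick_scaler(context, registry, default="bilinear"):
    """Resolve the user's preferred scaler for a context, falling back
    to a system-wide default for contexts with no stored choice."""
    name = SCALER_PREFERENCES.get(context, default)
    return registry.get(name, registry[default])
```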

The ways to launch a module can also differ. For example, the module responsible for rendering the image viewer’s window can be called by the module that implements the virtual desktop behavior (when the user clicks on the file), or by the module that represents the program launch menu. After being launched, this module loads the other modules required to perform the current task. This description only demonstrates that the proposed interaction scheme can be implemented; it should not be taken as a complete specification for writing the image viewer program, the operating system, and/or its modules.