Publications by Jonas S Karlsson

Index

Articles are listed in reverse order of appearance.

Type of publication                         Title
Soon...                                     PhD Thesis
Later                                       Distributed OMEGA-storage: Scalable High Performance Distributed Storage
In-the-pipe                                 Live Optimization in SDDS Join Operations
Paper                                       OMEGA-storage: A Self Indexing Multi-Attribute Storage for Very Large Main Memories
Tech Report                                 OMEGA-storage: A Self Indexing Multi-Attribute Storage for Very Large Main Memories
Paper                                       hQT*: A Scalable Distributed Data Structure for High-Performance Spatial Access
Paper                                       Transparent Distribution in a Storage Manager
Tech Report                                 Scalable Storage for a DBMS Using Transparent Distribution
Internal report                             Synergy effects when using Linear Hashing (LH) Locally for Distributed Linear Hashing (LH*) on a Massive Parallel Machine (Parsytec GC)
Lic Thesis                                  A Scalable Data Structure for a Parallel Data Server
Paper                                       LH*LH: A Scalable High Performance Data Structure for Switched Multicomputers
Presentation of project at "Ramkonferense"  spAMOS: Scalable Parallel AMOS using LH*
Tech Report                                 LH*LH: A Scalable High Performance Data Structure for Switched Multicomputers
Unpublished                                 Light Weight Thread Implementation and Integration into AMOS DBMS
Internal Paper                              Implementing C-Portable Light Weight Threads
CAELAB Internal Document                    AMOS Programmer's Hackbook - a Guideline to the Source
CAELAB Memo                                 AMOS.v1 User's Guide
M Sc Thesis                                 An Implementation of Transaction Logging and Recovery in a Main Memory Resident Database System

Titles/Abstracts/Postscripts


Jonas S Karlsson, Martin L. Kersten
OMEGA-storage: A Self Indexing Multi-Attribute Storage for Very Large Main Memories
To be presented at the Australian Database Conference, Canberra, Australia, January

Abstract: Main memory storage is continuously improving, both in its price and its capacity. With this come new storage problems and new directions of possible usage. Just before the millennium, several main memory database systems are becoming commercially available. The hot areas for their deployment include boosting the performance of web-enabled systems, such as search engines and electronic auctioning systems. We present a novel data storage structure, the Omega-storage structure: a high-performance data structure for indexing very large amounts of multi-attribute data. The experiments show excellent performance for point retrieval, and highly efficient pruning for pattern searches. It provides the balanced storage previously achieved by random kd-trees, but avoids their increased pattern match search times through an effective assignment of attribute bits to the index. Moreover, it avoids the sensitivity of the kd-tree to insert order.


Jonas S Karlsson
hQT*: A Scalable Distributed Data Structure for High-Performance Spatial Access
Presented at the International Conference on Foundations of Data Organization, pp. 37-46, Kobe, Japan, November

Abstract: Spatial data storage stresses the capability of conventional DBMSs. We present a scalable distributed data structure, hQT*, which offers support for efficient spatial point and range queries using order-preserving hashing. It is designed to deal with skewed data and extends results obtained with scalable distributed hash files, LH*, and other hashing schemas. Performance analysis shows that an hQT* file is a viable schema for distributed data access, and in contrast to traditional quad-trees it avoids long traversals of hierarchical structures. Furthermore, the novel data structure is a complete design addressing scalable data storage, local server storage management, and the management of client addressing. We investigate several different client updating schemes, enabling better access load distribution for many "slow" clients.


J. S Karlsson, M. L. Kersten
Transparent Distribution in a Storage Manager
Presented at the International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, NV, USA, July 1998.

Abstract: Scalable Distributed Data Structures (SDDSs) provide self-managing and self-organizing data storage of potentially unbounded size. This stands in contrast to common distribution schemas deployed in conventional distributed DBMSs. SDDSs, however, have mostly been used in synthetic scenarios to investigate their properties. In this paper we concentrate on the integration of the LH* SDDS into Monet, our efficient and extensible DBMS. We show that this merge provides high-performance processing and scalable storage of very large sets of distributed data. In our implementation we extended the operators of the Monet language interpreter in such a way that access to data, whether it is distributed or locally stored, is transparent to the user. Performance measurements, obtained by querying distributed data with a number of operators on a number of nodes, show the viability of our approach.


J. S Karlsson, M. L. Kersten
Scalable Storage for a DBMS Using Transparent Distribution
Technical Report INS-R9710, CWI, Amsterdam, The Netherlands, 1997.

Abstract: Scalable Distributed Data Structures (SDDSs) provide self-managing and self-organizing data storage of potentially unbounded size. This stands in contrast to common distribution schemas deployed in conventional distributed DBMSs. SDDSs, however, have mostly been used in synthetic scenarios to investigate their properties. In this paper we concentrate on the integration of the LH* SDDS into our efficient and extensible DBMS, called Monet (see http://www.cwi.nl/~monet). We show that this merge permits processing very large sets of distributed data. In our implementation we extended the relational algebra interpreter in such a way that access to data, whether it is distributed or locally stored, is transparent to the user. The on-the-fly optimization of operations (heavily used in Monet) to deploy different strategies and scenarios inside the primary operators associated with an SDDS adds self-adaptiveness to the query system; it dynamically adapts itself to unforeseen situations. We illustrate the performance efficiency by experiments on a network of workstations. The transparent integration of SDDSs opens new perspectives for very large self-managing database systems.


Jonas S Karlsson (1997).
A Scalable Data Structure for a Parallel Data Server.
Thesis No 609 by Jonas S Karlsson, 1997
A Licentiate Thesis is a simpler form of PhD Thesis, written about 3 years after the MSc and available only in Sweden. A full PhD thesis takes 2-3 more years.

Abstract: In this thesis we identify the importance of appropriate data structures for parallel data servers. We focus on Scalable Distributed Data Structures for this purpose, in particular LH* and the new data structure LH*lh. An overview is given of related work and systems that have traditionally implied the need for such data structures. We begin by discussing high-performance databases, and this leads us to database machines and parallel data servers. We sketch an architecture for an LH*lh-based file storage that we plan to use for a parallel data server. We also show performance measures for LH*lh and present its algorithm in detail. The testbed, the Parsytec switched multicomputer, is described along with experience acquired during the implementation process. Parts of the thesis are based on the article on LH*lh published in the lecture notes from the 5th International Conference on Extending Database Technology, in Avignon, France 1996.


Karlsson, J. S., Litwin, W., and Risch, T. (1995).
LH*LH: A Scalable High Performance Data Structure for Switched Multicomputers.
Technical Report LiTH-IDA-R-95-25, Department of Computer and Information Science, Linköping University, Sweden. Has been accepted to EDBT-96, Avignon, France.

Abstract: LH*LH is a new data structure for scalable high-performance hash files on the increasingly popular switched multicomputers, i.e., MIMD multiprocessor machines with distributed RAM and without shared memory. An LH*LH file scales up gracefully over available processors and the distributed memory, easily reaching Gbytes. Address calculus does not require any centralized component that could lead to a hot spot. Access times to the file can be under a millisecond and the file can be used in parallel by several client processors. We show the LH*LH design, and report on the performance analysis. This includes experiments on the Parsytec GC/PowerPlus multicomputer with up to 128 Power PCs and 32 MB of distributed RAM per node. We prove the efficiency of the method and justify various algorithmic choices that were made. LH*LH opens a new perspective for high-performance applications, especially for the database management of new types of data and in real-time environments.


Karlsson, J. S. (1995).
An Implementation of Transaction Logging and Recovery in a Main Memory Resident Database System.
Master Thesis LiTH-IDA-Ex-94-04, Department of Computer and Information Science, Linköping University, Sweden.

Abstract: This report describes an implementation of Transaction Logging and Recovery using Unix Copy-On-Write on spawned processes. The purpose of the work is to extend WS-Iris, a research project on Object Oriented Main Memory Databases, with functionality for failure recovery.

The presented work is a Master's Thesis for a student of Master of Science in Computer Science and Technology. The work has been commissioned by Tore Risch, Professor of Engineering Databases at the Computer Aided Engineering laboratory (CAElab), Linköping University (LiU/LiTH), Sweden.


J.S. Karlsson, S. Flodin, K. Orsborn, T. Risch, M. Sköld, M. Werner (1994)
Amos.v1 User's Guide.
CAELAB Memo 94-01, Department of Computer and Information Science, Linköping University, Sweden, March 1994.

Abstract: AMOS (Active Mediating Object System) is an Object-Relational database system. AMOS differs from the first generation of Object-Oriented (OO) databases in that a relationally complete query language, AMOSQL, is available, which is more general than relational query languages such as SQL. Furthermore, AMOS is a main-memory database system, since the design of AMOS is optimized for efficient execution assuming that the entire database fits in main memory. For persistence, the system provides primitives for logging, saving, and restarting the database from disk. AMOS is implemented in C and runs on HP and SUN Unix platforms. This manual describes how to use the AMOSQL query language. For interfaces to C and Lisp, and a description of some internals, see the AMOS Systems Manual.



Jonas S. Karlsson
Last modified: Wed Dec 22 19:55:59 CET 1999