WorldCat Identities

Song, S. W.

Works: 9 works in 13 publications in 1 language and 22 library holdings
Roles: Author
Classifications: QA76.9.D3, 510.7808
Most widely held works by S. W. Song
An efficient parallel garbage collection system and its correctness proof by H. T. Kung (Book)

3 editions published in 1977 in English and Undetermined and held by 9 WorldCat member libraries worldwide

An efficient system to perform garbage collection in parallel with list operations is proposed and its correctness is proven. The system consists of two independent processes sharing a common memory. One process is performed by the list processor (LP) for list processing and the other by the garbage collector (GC) for marking active nodes and collecting garbage nodes. The system is derived by using both the correctness and efficiency arguments. Assuming that memory references are indivisible the system satisfies the following properties: No critical sections are needed in the entire system. The time to perform the marking phase by the GC is independent of the size of memory, but depends only on the number of active nodes. Nodes on the free list need not be marked during the marking phase by the GC. Minimum overheads are introduced to the LP. Only two extra bits for encoding four colors are needed for each node. Efficiency results show that the parallel system is usually significantly more efficient in terms of storage and time than the sequential stack algorithm. (Author)
A systolic 2-D convolution chip by H. T. Kung (Book)

3 editions published in 1981 in English and held by 6 WorldCat member libraries worldwide

This paper describes a chip for performing the 2-D (two-dimensional) convolution in signal and image processing. The chip, based on a systolic design, consists of essentially only one type of simple cell; the cells are mesh-interconnected in a regular and modular way, and the chip achieves high performance through extensive concurrent and pipelined use of these cells. Denoting by u the cycle time of the basic cell, the chip allows convolving a k×k window with an n×n image in O(n²u/k) time, using a total of k³ basic cells. The total number of cells is optimal in the sense that the usual sequential algorithm takes O(n²k²u) time. Furthermore, because of the modularity of the design, the number of cells used by the chip can be easily adjusted to achieve any desirable balance between I/O and computation speeds. (Author)
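For reference, the "usual sequential algorithm" mentioned in the abstract can be sketched as a direct quadruple loop. This is only a minimal illustration of the O(n²k²) sequential baseline, not the systolic design itself; the function name `convolve2d` and the choice to compute only the fully overlapping ("valid") outputs are assumptions made for the example.

```python
def convolve2d(image, window):
    """Direct 2-D convolution of an n x n image with a k x k window.

    Hypothetical helper for illustration: returns only the positions where
    the window fully overlaps the image, an (n-k+1) x (n-k+1) result.
    """
    n, k = len(image), len(window)
    out_size = n - k + 1
    out = [[0] * out_size for _ in range(out_size)]
    for i in range(out_size):
        for j in range(out_size):
            acc = 0
            for a in range(k):
                for b in range(k):
                    # Convolution flips the window in both dimensions.
                    acc += image[i + a][j + b] * window[k - 1 - a][k - 1 - b]
            out[i][j] = acc
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
window = [[1, 0],
          [0, 0]]
result = convolve2d(image, window)
```

Each output value costs k² multiply-adds and there are on the order of n² outputs, which gives the O(n²k²) operation count that the chip's k³ cells divide down to O(n²/k) cycles.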
On a high-performance VLSI solution to database problems by Siang Wun Song (Book)

1 edition published in 1981 in English and held by 1 WorldCat member library worldwide

This thesis explores the design and use of custom-made VLSI hardware in the area of database problems. Our effort differs from most previous ones in that we search for structures and algorithms, directly implementable on silicon, for the solution of computation-intensive database problems. The types of target database systems include the general database management systems and the design database systems. The thesis deals mainly with database systems of the relational model. One common view concerning special-purpose hardware usage is that it performs a specific task. The proposed device is not a hardware solution to a specific problem, but provides a number of useful data structures and basic operations. It can be used to improve the performance of any sequential algorithm which makes extensive use of such data structures and basic operations. The design is based on a few basic cells, interconnected together in the form of a complete binary tree. The proposed device can handle all the basic relational operations: select, join, project, union, and intersection. With a special-purpose device of limited size attached to a host, the overall performance may ultimately be dictated by the I/O between the two sites. The ideal special-purpose device design is one that achieves a balance between computation and I/O. We propose a model to study the I/O complexity for sorting n numbers with any special-purpose hardware device of size s, and show a lower bound result of Ω(n log n / log s). We present an optimal design achieving this bound. An important finding is that for practical ranges on the quantity of data to be sorted, systolic sorting devices of small sizes can beat fast sequential sorting algorithms.
Achieving optimality for gate matrix layout and PLA folding: a graph theoretic approach by Afonso Ferreira (Book)

1 edition published in 1992 in English and held by 1 WorldCat member library worldwide

A parallel algorithm for transitive closure by E. N. Cáceres (Book)

1 edition published in 2002 in English and held by 1 WorldCat member library worldwide

Abstract: "We present a parallel algorithm for the problem of computing the transitive closure for an acyclic digraph D with n vertices and m edges. We use the BSP/CGM model of parallel computing. Our algorithm uses O(log p) rounds of communications with p processors, where p ≤ n, and each processor has O(mn/p) local memory. The local computation of each processor is equal to the product of the number of edges and vertices of D that are stored in p."
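To make the problem statement concrete, the transitive closure of an acyclic digraph can be computed sequentially by memoized depth-first search. The sketch below is a plain sequential baseline, not the BSP/CGM algorithm of the paper; the function name `transitive_closure_dag` and the edge-list input format are assumptions for the example.

```python
def transitive_closure_dag(n, edges):
    """Return reach[u] = set of vertices reachable from u in an acyclic digraph.

    Hypothetical helper: vertices are 0..n-1 and edges is a list of (u, v)
    pairs. The input is assumed to be acyclic; a cycle would cause unbounded
    recursion.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)

    reach = [None] * n

    def visit(u):
        # Memoize: each vertex's reachable set is computed once.
        if reach[u] is None:
            r = set()
            for v in adj[u]:
                r.add(v)
                r |= visit(v)
            reach[u] = r
        return reach[u]

    for u in range(n):
        visit(u)
    return reach


closure = transitive_closure_dag(4, [(0, 1), (1, 2), (0, 3)])
```

The recursion depth equals the longest path in the digraph, so for very deep graphs an explicit stack or a topological ordering would be the safer choice.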
Revisiting cycle shrinking by Y. Robert (Book)

1 edition published in 1991 in English and held by 1 WorldCat member library worldwide

A highly configurable architecture for systolic arrays of powerful processors by Universidade de São Paulo (Book)

1 edition published in 1990 in English and held by 1 WorldCat member library worldwide

Sequential and parallel algorithms for the all-substrings longest common subsequence problem by Carlos Eduardo Rodrigues Alves (Book)

1 edition published in 2003 in English and held by 1 WorldCat member library worldwide

Abstract: "Given two strings A and B of lengths n_a and n_b, respectively, the All-substrings Longest Common Subsequence (ALCS) problem obtains, for any substring B′ of B, the length of the longest string that is a subsequence of both A and B′. The sequential algorithm takes O(n_a n_b) time and O(n_b) space. We present a parallel algorithm for the ALCS on the Coarse Grained Multicomputer (BSP/CGM) model with p < sqrt(m) processors, that takes O(n_a n_b / p) time and O(n_b sqrt(n_a)) space per processor, with O(log p) communication rounds. The proposed algorithm also solves the basic Longest Common Subsequence (LCS) Problem that finds the longest string (and not only its length) that is a subsequence of both A and B. To our knowledge, this is the best BSP/CGM algorithm for the LCS and ALCS problems in the literature."
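As background for the bounds quoted in the abstract, the length of a basic LCS can be computed with the textbook dynamic program in O(n_a n_b) time while keeping only one row of the table, i.e. O(n_b) space. The sketch below covers only that basic LCS length, not the ALCS algorithm of the paper; the function name `lcs_length` is an assumption for the example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b.

    Hypothetical helper: classic row-by-row dynamic programming that keeps
    a single previous row, so extra space is proportional to len(b).
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                # Extend the best subsequence ending before both characters.
                cur.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one character from either string.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


length = lcs_length("ABCBDAB", "BDCABA")
```

Solving ALCS naively would repeat this computation for every substring B′ of B, which is what the specialized sequential and parallel algorithms avoid.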
Revisiting Hamiltonian decomposition of the hypercube by Kunio Okuda (Book)

1 edition published in 1998 in English and held by 1 WorldCat member library worldwide

Abstract: "The Hamiltonian decomposition of a hypercube or binary n-cube is the partitioning of its edge set into Hamiltonian cycles. It is known that there are ⌊n/2⌋ disjoint Hamiltonian cycles on a binary n-cube. The proof of this result, however, does not give rise to any simple construction algorithm of such cycles. In a previous work Song presents ideas towards a simple and interesting method to this problem. Two phases are involved. First decompose the binary n-cube into cycles of length 16, C₁₆, and then apply a merge operator to join the C₁₆ cycles into larger Hamiltonian cycles. The case of dimension n=6 (a 64-node hypercube) is illustrated. He conjectures the method can be generalized for any even n. In this paper, we generalize the first phase of that method for any even n and prove its correctness. Also we show four possible merge operators for the case of n=8 (a 256-node hypercube). This result can be viewed as a step toward the general merge operator, thus proving the conjecture."
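To illustrate what a single Hamiltonian cycle on a binary n-cube looks like, the sketch below checks a candidate cycle and tests it on the reflected Gray code sequence, a standard example of such a cycle. It does not reproduce the C₁₆ decomposition or the merge operators described in the abstract; the names `is_hamiltonian_cycle` and `gray_cycle` are assumptions for the example.

```python
def is_hamiltonian_cycle(cycle, n):
    """Check that cycle visits every vertex of the n-cube once along cube edges.

    Hypothetical helper: vertices are the integers 0..2**n - 1, and two
    vertices are adjacent when their labels differ in exactly one bit.
    """
    size = 1 << n
    if len(cycle) != size or set(cycle) != set(range(size)):
        return False
    for i in range(size):
        diff = cycle[i] ^ cycle[(i + 1) % size]
        # diff must be a nonzero power of two (exactly one bit flipped).
        if diff == 0 or diff & (diff - 1):
            return False
    return True


def gray_cycle(n):
    """Reflected Gray code order of the n-cube vertices."""
    return [i ^ (i >> 1) for i in range(1 << n)]


ok = is_hamiltonian_cycle(gray_cycle(4), 4)
```

A full decomposition for even n would additionally require n/2 such cycles whose edge sets are pairwise disjoint and together cover every edge of the cube.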
Audience Level
Audience level: 0.75 (from 0.53 for A parallel ... to 0.92 for Revisiting ...)

English (12)