仓颉-Collection 类型

发表于 2026-01-03 分类于编程语言，仓颉

基础 Collection 类型概述

本章介绍仓颉语言中常用的几种基础 Collection 类型，包括 Array、ArrayList、HashSet 和 HashMap。

可以在不同的场景中选择适合对应业务的类型：

Array：不需要增加和删除元素，但需要修改元素
ArrayList：需要频繁对元素增删查改
HashSet：希望每个元素都是唯一的
HashMap：希望存储一系列的映射关系

下表是这些类型的基础特性：

类型名称	元素可变	增删元素	元素唯一性	有序序列
`Array<T>`	Y	N	N	Y
`ArrayList<T>`	Y	Y	N	Y
`HashSet<T>`	N	Y	Y	N
`HashMap<K, V>`	K: N, V: Y	Y	K: Y, V: N	N

ArrayList

使用 ArrayList 类型需要导入 collection 包：

1	import std.collection.*

仓颉使用 ArrayList<T> 表示 ArrayList 类型，T 表示 ArrayList 的元素类型，T 可以是任意类型。

ArrayList 具备非常好的扩容能力，适合于需要频繁增加和删除元素的场景。

相比 Array，ArrayList 既可以原地修改元素，也可以原地增加和删除元素。

ArrayList 的可变性是一个非常有用的特征，可以让同一个 ArrayList 实例的所有引用都共享同样的元素，并且对它们统一进行修改。

1 2	var a: ArrayList<Int64> = ... // ArrayList whose element type is Int64 var b: ArrayList<String> = ... // ArrayList whose element type is String

元素类型不相同的 ArrayList 是不相同的类型，所以它们之间不可以互相赋值。

因此以下例子是不合法的。

1	b = a // Type mismatch

仓颉中可以使用构造函数的方式构造一个指定的 ArrayList。

let a = ArrayList<String>() // Created an empty ArrayList whose element type is String
let b = ArrayList<String>(100) // Created an ArrayList whose element type is String, and allocate a space of 100
let c = ArrayList<Int64>([0, 1, 2]) // Created an ArrayList whose element type is Int64, containing elements 0, 1, 2
let d = ArrayList<Int64>(c) // Use another Collection to initialize an ArrayList
let e = ArrayList<String>(2, {x: Int64 => x.toString()}) // Created an ArrayList whose element type is String and size is 2. All elements are initialized by specified rule function

访问 ArrayList 成员

当需要对 ArrayList 的所有元素进行访问时，可以使用 for-in 循环遍历 ArrayList 的所有元素。

import std.collection.*

main() {
    let list = ArrayList<Int64>([0, 1, 2])
    for (i in list) {
        println("The element is ${i}")
    }
}

编译并执行上面的代码，会输出：

1
2
3

The element is 0
The element is 1
The element is 2

当需要知道某个 ArrayList 包含的元素个数时，可以使用 size 属性获得对应信息。

import std.collection.*

main() {
    let list = ArrayList<Int64>([0, 1, 2])
    if (list.size == 0) {
        println("This is an empty arraylist")
    } else {
        println("The size of arraylist is ${list.size}")
    }
}

编译并执行上面的代码，会输出：

1	The size of arraylist is 3

当想访问单个指定位置的元素时，可以使用下标语法访问（下标的类型必须是 Int64）。非空 ArrayList 的第一个元素总是从位置 0 开始的。可以从 0 开始访问 ArrayList 的任意一个元素，直到最后一个位置（ArrayList 的 size - 1）。使用负数或大于等于 size 的索引会触发运行时异常。

1
2
3

let a = list[0] // a == 0
let b = list[1] // b == 1
let c = list[-1] // Runtime exceptions

ArrayList 也支持下标中使用 Range 的语法，详见 Array 章节。

修改 ArrayList

可以使用下标语法对某个位置的元素进行修改。

1 2	let list = ArrayList<Int64>([0, 1, 2]) list[0] = 3

ArrayList 是引用类型，ArrayList 在作为表达式使用时不会拷贝副本，同一个 ArrayList 实例的所有引用都会共享同样的数据。

因此对 ArrayList 元素的修改会影响到该实例的所有引用。

let list1 = ArrayList<Int64>([0, 1, 2])
let list2 = list1
list2[0] = 3
// list1 contains elements 3, 1, 2
// list2 contains elements 3, 1, 2

如果需要将单个元素添加到 ArrayList 的末尾，请使用 add 函数。如果希望同时添加多个元素到末尾，可以使用 add(all!: Collection<T>) 函数，这个函数可以接受其他相同元素类型的 Collection 类型（例如 Array）。

import std.collection.*

main() {
    let list = ArrayList<Int64>()
    list.add(0) // list contains element 0
    list.add(1) // list contains elements 0, 1
    let li = [2, 3]
    list.add(all: li) // list contains elements 0, 1, 2, 3
}

可以通过 add(T, at!: Int64) 和 add(all!: Collection<T>, at!: Int64) 函数将指定的单个元素或相同元素类型的 Collection 值插入到指定索引的位置。该索引处的元素和后面的元素会被挪后以腾出空间。

1 2	let list = ArrayList<Int64>([0, 1, 2]) // list contains elements 0, 1, 2 list.add(4, at: 1) // list contains elements 0, 4, 1, 2

从 ArrayList 中删除元素，可以使用 remove 函数，需要指定删除的索引。该索引处后面的元素会前移以填充空间。

1
2

let list = ArrayList<String>(["a", "b", "c", "d"]) // list contains the elements "a", "b", "c", "d"
list.remove(at: 1) // Delete the element at subscript 1, now the list contains elements "a", "c", "d"

增加 ArrayList 的大小

每个 ArrayList 都需要特定数量的内存来保存其内容。当向 ArrayList 添加元素并且该 ArrayList 开始超出其保留容量时，该 ArrayList 会分配更大的内存区域并将其所有元素复制到新内存中。这种增长策略意味着触发重新分配内存的添加操作具有性能成本，但随着 ArrayList 的保留内存变大，它们发生的频率会越来越低。

如果知道大约需要添加多少个元素，可以在添加之前预备足够的内存以避免中间重新分配，这样可以提升性能表现。

import std.collection.*

main() {
    let list = ArrayList<Int64>(100) // Allocate space at once
    for (i in 0..100) {
        list.add(i) // Does not trigger reallocation of space
    }
    list.reserve(100) // Prepare more space
    for (i in 0..100) {
        list.add(i) // Does not trigger reallocation of space
    }
}

HashSet

使用 HashSet 类型需要导入 collection 包：

1	import std.collection.*

可以使用 HashSet 类型来构造只拥有不重复元素的 Collection。

仓颉使用 HashSet<T> 表示 HashSet 类型，T 表示 HashSet 的元素类型，T 必须是实现了 Hashable 和 Equatable<T> 接口的类型，例如数值或 String。

1 2	var a: HashSet<Int64> = ... // HashSet whose element type is Int64 var b: HashSet<String> = ... // HashSet whose element type is String

元素类型不相同的 HashSet 是不相同的类型，所以它们之间不可以互相赋值。

因此以下例子是不合法的。

1	b = a // Type mismatch

仓颉中可以使用构造函数的方式构造一个指定的 HashSet。

let a = HashSet<String>() // Created an empty HashSet whose element type is String
let b = HashSet<String>(100) // Created a HashSet whose capacity is 100
let c = HashSet<Int64>([0, 1, 2]) // Created a HashSet whose element type is Int64, containing elements 0, 1, 2
let d = HashSet<Int64>(c) // Use another Collection to initialize a HashSet
let e = HashSet<Int64>(10, {x: Int64 => (x * x)}) // Created a HashSet whose element type is Int64 and size is 10. All elements are initialized by specified rule function

访问 HashSet 成员

当需要对 HashSet 的所有元素进行访问时，可以使用 for-in 循环遍历 HashSet 的所有元素。

需要注意的是，HashSet 并不保证按插入元素的顺序排列，因此遍历的顺序和插入的顺序可能不同。

import std.collection.*

main() {
    let mySet = HashSet<Int64>([0, 1, 2])
    for (i in mySet) {
        println("The element is ${i}")
    }
}

编译并执行上面的代码，有可能会输出：

1
2
3

The element is 0
The element is 1
The element is 2

当需要知道某个 HashSet 包含的元素个数时，可以使用 size 属性获得对应信息。

import std.collection.*

main() {
    let mySet = HashSet<Int64>([0, 1, 2])
    if (mySet.size == 0) {
        println("This is an empty hashset")
    } else {
        println("The size of hashset is ${mySet.size}")
    }
}

编译并执行上面的代码，会输出：

1	The size of hashset is 3

当想判断某个元素是否被包含在某个 HashSet 中时，可以使用 contains 函数。如果该元素存在会返回 true，否则返回 false。

1
2
3

let mySet = HashSet<Int64>([0, 1, 2])
let a = mySet.contains(0) // a == true
let b = mySet.contains(-1) // b == false

修改 HashSet

HashSet 是一种可变的引用类型，HashSet 类型提供了添加元素、删除元素的功能。

HashSet 的可变性是一个非常有用的特征，可以让同一个 HashSet 实例的所有引用都共享同样的元素，并且对它们统一进行修改。

如果需要将单个元素添加到 HashSet 里，请使用 add 函数。如果希望同时添加多个元素，可以使用 add(all!: Collection<T>) 函数，这个函数可以接受另一个相同元素类型的 Collection 类型（例如 Array）。当元素不存在时，add 函数会执行添加的操作，当 HashSet 中存在相同元素时，add 函数将不会有效果。

let mySet = HashSet<Int64>()
mySet.add(0) // mySet contains elements 0
mySet.add(0) // mySet contains elements 0
mySet.add(1) // mySet contains elements 0, 1
let li = [2, 3]
mySet.add(all: li) // mySet contains elements 0, 1, 2, 3

HashSet 是引用类型，HashSet 在作为表达式使用时不会拷贝副本，同一个 HashSet 实例的所有引用都会共享同样的数据。

因此对 HashSet 元素的修改会影响到该实例的所有引用。

let set1 = HashSet<Int64>([0, 1, 2])
let set2 = set1
set2.add(3)
// set1 contains elements 0, 1, 2, 3
// set2 contains elements 0, 1, 2, 3

从 HashSet 中删除元素，可以使用 remove 函数，需要指定删除的元素。

1 2	let mySet = HashSet<Int64>([0, 1, 2, 3]) mySet.remove(1) // mySet contains elements 0, 2, 3

HashMap

使用 HashMap 类型需要导入 collection 包：

1	import std.collection.*

可以使用 HashMap 类型来构造元素为键值对的 Collection。

HashMap 是一种哈希表，提供对其包含的元素的快速访问。表中的每个元素都使用其键作为标识，可以使用键来访问相应的值。

仓颉使用 HashMap<K, V> 表示 HashMap 类型，K 表示 HashMap 的键类型，K 必须是实现了 Hashable 和 Equatable<K> 接口的类型，例如数值或 String。V 表示 HashMap 的值类型，V 可以是任意类型。

1 2	var a: HashMap<Int64, Int64> = ... // HashMap whose key type is Int64 and value type is Int64 var b: HashMap<String, Int64> = ... // HashMap whose key type is String and value type is Int64

元素类型不相同的 HashMap 是不相同的类型，所以它们之间不可以互相赋值。

因此以下例子是不合法的。

1	b = a // Type mismatch

仓颉中可以使用构造函数的方式构造一个指定的 HashMap。

let a = HashMap<String, Int64>() // Created an empty HashMap whose key type is String and value type is Int64
let b = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)]) // whose key type is String and value type is Int64, containing elements ("a", 0), ("b", 1), ("c", 2)
let c = HashMap<String, Int64>(b) // Use another Collection to initialize a HashMap
let d = HashMap<String, Int64>(10) // Created a HashMap whose key type is String and value type is Int64 and capacity is 10
let e = HashMap<Int64, Int64>(10, {x: Int64 => (x, x * x)}) // Created a HashMap whose key and value type is Int64 and size is 10. All elements are initialized by specified rule function

访问 HashMap 成员

当需要对 HashMap 的所有元素进行访问时，可以使用 for-in 循环遍历 HashMap 的所有元素。

需要注意的是，HashMap 并不保证按插入元素的顺序排列，因此遍历的顺序和插入的顺序可能不同。

import std.collection.*

main() {
    let map = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)])
    for ((k, v) in map) {
        println("The key is ${k}, the value is ${v}")
    }
}

编译并执行上面的代码，有可能会输出：

1
2
3

The key is a, the value is 0
The key is b, the value is 1
The key is c, the value is 2

当需要知道某个 HashMap 包含的元素个数时，可以使用 size 属性获得对应信息。

import std.collection.*

main() {
    let map = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)])
    if (map.size == 0) {
        println("This is an empty hashmap")
    } else {
        println("The size of hashmap is ${map.size}")
    }
}

编译并执行上面的代码，会输出：

1	The size of hashmap is 3

当想判断 HashMap 中是否包含某个键时，可以使用 contains 函数。如果该键存在会返回 true，否则返回 false。

1
2
3

let map = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)])
let a = map.contains("a") // a == true
let b = map.contains("d") // b == false

当想访问指定键对应的元素时，可以使用下标语法访问（下标的类型必须是键类型）。使用不存在的键作为索引会触发运行时异常。

let map = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)])
let a = map["a"] // a == 0
let b = map["b"] // b == 1
let c = map["d"] // Runtime exceptions

修改 HashMap

HashMap 是一种可变的引用类型，HashMap 类型提供了修改元素、添加元素、删除元素的功能。

HashMap 的可变性是一个非常有用的特征，可以让同一个 HashMap 实例的所有引用都共享同样的元素，并且对它们统一进行修改。

可以使用下标语法对某个键对应的值进行修改。

1 2	let map = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)]) map["a"] = 3

HashMap 是引用类型，HashMap 在作为表达式使用时不会拷贝副本，同一个 HashMap 实例的所有引用都会共享同样的数据。

因此对 HashMap 元素的修改会影响到该实例的所有引用。

let map1 = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)])
let map2 = map1
map2["a"] = 3
// map1 contains the elements ("a", 3), ("b", 1), ("c", 2)
// map2 contains the elements ("a", 3), ("b", 1), ("c", 2)

如果需要将单个键值对添加到 HashMap 里，请使用 add 函数。如果希望同时添加多个键值对，可以使用 add(all!: Collection<(K, V)>) 函数。当键不存在时，add 函数会执行添加的操作，当键存在时，add 函数会将新的值覆盖旧的值。

let map = HashMap<String, Int64>()
map.add("a", 0) // map contains the element ("a", 0)
map.add("b", 1) // map contains the elements ("a", 0), ("b", 1)
let map2 = HashMap<String, Int64>([("c", 2), ("d", 3)])
map.add(all: map2) // map contains the elements ("a", 0), ("b", 1), ("c", 2), ("d", 3)

除了使用 add 函数以外，也可以使用赋值的方式直接将新的键值对添加到 HashMap。

1 2	let map = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2)]) map["d"] = 3 // map contains the elements ("a", 0), ("b", 1), ("c", 2), ("d", 3)

从 HashMap 中删除元素，可以使用 remove 函数，需要指定删除的键。

1 2	let map = HashMap<String, Int64>([("a", 0), ("b", 1), ("c", 2), ("d", 3)]) map.remove("d") // map contains the elements ("a", 0), ("b", 1), ("c", 2)

Iterable 和 Collections

前面已经了解过 Range、Array、ArrayList，它们都可以使用 for-in 进行遍历操作。对于开发者自定义的类型，也能实现类似的遍历操作。

Range、Array、ArrayList 都是通过 Iterable 来支持 for-in 语法的。

Iterable 是如下形式（只展示了核心代码）的一个内置 interface。

interface Iterable<T> {
    func iterator(): Iterator<T>
    ...
}

iterator 函数要求返回的 Iterator 类型是如下形式（只展示了核心代码）的另一个内置 interface。

interface Iterator<T> <: Iterable<T> {
    mut func next(): Option<T>
    ...
}

可以使用 for-in 语法来遍历任何一个实现了 Iterable 接口类型的实例。

假设有这样一段 for-in 代码，如下所示。

let list = [1, 2, 3]
for (i in list) {
    println(i)
}

它等价于如下形式的 while 代码。

let list = [1, 2, 3]
var it = list.iterator()
while (true) {
    match (it.next()) {
        case Some(i) => println(i)
        case None => break
    }
}

另外一种常见的遍历 Iterable 类型的方法是在 while 表达式的条件中使用模式匹配，比如上面 while 代码的另一种等价写法是：

let list = [1, 2, 3]
var it = list.iterator()
while (let Some(i) <- it.next()) {
    println(i)
}

Array、ArrayList、HashSet、HashMap 类型都实现了 Iterable，因此可以将其用在 for-in 或者 while 中。