Subsections of 📃Articles

Subsections of Backup

ES [Local Disk]

Preliminary

  • ElasticSearch has been installed; if not, check this link

  • The elasticsearch.yml has path.repo configured, which should be set to the same value as settings.location (this is handled by the Helm chart, so don't worry)

    ES argocd-app yaml
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: elastic-search
    spec:
      syncPolicy:
        syncOptions:
        - CreateNamespace=true
      project: default
      source:
        repoURL: https://charts.bitnami.com/bitnami
        chart: elasticsearch
        targetRevision: 19.11.3
        helm:
          releaseName: elastic-search
          values: |
            global:
              kibanaEnabled: true
            clusterName: elastic
            image:
              registry: m.zjvis.net/docker.io
              pullPolicy: IfNotPresent
            security:
              enabled: false
            service:
              type: ClusterIP
            extraConfig:
              path:
                repo: /tmp
            ingress:
              enabled: true
              annotations:
                cert-manager.io/cluster-issuer: self-signed-ca-issuer
                nginx.ingress.kubernetes.io/rewrite-target: /$1
              hostname: elastic-search.dev.tech
              ingressClassName: nginx
              path: /?(.*)
              tls: true
            master:
              masterOnly: false
              replicaCount: 1
              persistence:
                enabled: false
              resources:
                requests:
                  cpu: 2
                  memory: 1024Mi
                limits:
                  cpu: 4
                  memory: 4096Mi
              heapSize: 2g
            data:
              replicaCount: 0
              persistence:
                enabled: false
            coordinating:
              replicaCount: 0
            ingest:
              enabled: true
              replicaCount: 0
              service:
                enabled: false
                type: ClusterIP
              ingress:
                enabled: false
            metrics:
              enabled: false
              image:
                registry: m.zjvis.net/docker.io
                pullPolicy: IfNotPresent
            volumePermissions:
              enabled: false
              image:
                registry: m.zjvis.net/docker.io
                pullPolicy: IfNotPresent
            sysctlImage:
              enabled: true
              registry: m.zjvis.net/docker.io
              pullPolicy: IfNotPresent
            kibana:
              elasticsearch:
                hosts:
                  - '{{ include "elasticsearch.service.name" . }}'
                port: '{{ include "elasticsearch.service.ports.restAPI" . }}'
            esJavaOpts: "-Xmx2g -Xms2g"        
      destination:
        server: https://kubernetes.default.svc
        namespace: application

    Diff from the original file:

    extraConfig:
        path:
          repo: /tmp

Methods

There are two ways to back up Elasticsearch:

  1. Export the data to text files, e.g. using tools such as elasticdump or esm to dump the data stored in Elasticsearch into files.
  2. Use the snapshot API, which supports incremental backups.

The first approach is relatively simple and practical for small data volumes, but for large data volumes the snapshot API is the recommended way.
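For the first approach, here is a minimal sketch using elasticdump (assumptions: elasticdump has been installed via npm install -g elasticdump, and the books index used below is purely illustrative):

# Dump the documents of the "books" index into a local JSON file.
# With a self-signed certificate you may also need: export NODE_TLS_REJECT_UNAUTHORIZED=0
elasticdump \
  --input=https://elastic-search.dev.tech:32443/books \
  --output=books_data.json \
  --type=data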

Steps

backup


  1. Create a snapshot repository -> my_fs_repository
curl -k -X PUT "https://elastic-search.dev.tech:32443/_snapshot/my_fs_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/tmp"
  }
}
'

You can also use a storage class to mount a path into the pod and store the snapshot files on that external volume.

  2. Verify that every node in the cluster can use this snapshot repository
curl -k -X POST "https://elastic-search.dev.tech:32443/_snapshot/my_fs_repository/_verify?pretty"
  3. List all snapshot repositories
curl -k -X GET "https://elastic-search.dev.tech:32443/_snapshot/_all?pretty"
  4. Inspect the settings of a specific snapshot repository
curl -k -X GET "https://elastic-search.dev.tech:32443/_snapshot/my_fs_repository?pretty"
  5. Analyze a snapshot repository
curl -k -X POST "https://elastic-search.dev.tech:32443/_snapshot/my_fs_repository/_analyze?blob_count=10&max_blob_size=1mb&timeout=120s&pretty"
  6. Take a snapshot manually
curl -k -X PUT "https://elastic-search.dev.tech:32443/_snapshot/my_fs_repository/ay_snap_02?pretty"
  7. List the snapshots available in a given repository
curl -k -X GET "https://elastic-search.dev.tech:32443/_snapshot/my_fs_repository/*?verbose=false&pretty"
  8. Test a restore (a note on seeding the books index follows this block)
# Delete an index
curl -k -X DELETE "https://elastic-search.dev.tech:32443/books?pretty"

# restore that index
curl -k -X POST "https://elastic-search.dev.tech:32443/_snapshot/my_fs_repository/ay_snap_02/_restore?pretty" -H 'Content-Type: application/json' -d'
{
  "indices": "books"
}
'

# query
curl -k -X GET "https://elastic-search.dev.tech:32443/books/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all": {}
  }
}
'
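Note: the restore test assumes a books index already existed when the snapshot was taken. A minimal sketch to seed one before step 6 (the document content is illustrative):

curl -k -X POST "https://elastic-search.dev.tech:32443/books/_doc?pretty" -H 'Content-Type: application/json' -d'
{
  "name": "Snow Crash",
  "author": "Neal Stephenson"
}
'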
Oct 7, 2024

ES [S3 Compatible]

Preliminary

  • ElasticSearch has been installed; if not, check this link

    ES argocd-app yaml
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: elastic-search
    spec:
      syncPolicy:
        syncOptions:
        - CreateNamespace=true
      project: default
      source:
        repoURL: https://charts.bitnami.com/bitnami
        chart: elasticsearch
        targetRevision: 19.11.3
        helm:
          releaseName: elastic-search
          values: |
            global:
              kibanaEnabled: true
            clusterName: elastic
            image:
              registry: m.zjvis.net/docker.io
              pullPolicy: IfNotPresent
            security:
              enabled: true
            service:
              type: ClusterIP
            extraEnvVars:
            - name: S3_ACCESSKEY
              value: admin
            - name: S3_SECRETKEY
              value: ZrwpsezF1Lt85dxl
            extraConfig:
              s3:
                client:
                  default:
                    protocol: http
                    endpoint: "http://192.168.31.111:9090"
                    path_style_access: true
            initScripts:
              configure-s3-client.sh: |
                elasticsearch_set_key_value "s3.client.default.access_key" "${S3_ACCESSKEY}"
                elasticsearch_set_key_value "s3.client.default.secret_key" "${S3_SECRETKEY}"
            hostAliases:
            - ip: 192.168.31.111
              hostnames:
              - minio-api.dev.tech
            ingress:
              enabled: true
              annotations:
                cert-manager.io/cluster-issuer: self-signed-ca-issuer
                nginx.ingress.kubernetes.io/rewrite-target: /$1
              hostname: elastic-search.dev.tech
              ingressClassName: nginx
              path: /?(.*)
              tls: true
            master:
              masterOnly: false
              replicaCount: 1
              persistence:
                enabled: false
              resources:
                requests:
                  cpu: 2
                  memory: 1024Mi
                limits:
                  cpu: 4
                  memory: 4096Mi
              heapSize: 2g
            data:
              replicaCount: 0
              persistence:
                enabled: false
            coordinating:
              replicaCount: 0
            ingest:
              enabled: true
              replicaCount: 0
              service:
                enabled: false
                type: ClusterIP
              ingress:
                enabled: false
            metrics:
              enabled: false
              image:
                registry: m.zjvis.net/docker.io
                pullPolicy: IfNotPresent
            volumePermissions:
              enabled: false
              image:
                registry: m.zjvis.net/docker.io
                pullPolicy: IfNotPresent
            sysctlImage:
              enabled: true
              registry: m.zjvis.net/docker.io
              pullPolicy: IfNotPresent
            kibana:
              elasticsearch:
                hosts:
                  - '{{ include "elasticsearch.service.name" . }}'
                port: '{{ include "elasticsearch.service.ports.restAPI" . }}'
            esJavaOpts: "-Xmx2g -Xms2g"        
      destination:
        server: https://kubernetes.default.svc
        namespace: application

    Diff from the original file:

    extraEnvVars:
    - name: S3_ACCESSKEY
      value: admin
    - name: S3_SECRETKEY
      value: ZrwpsezF1Lt85dxl
    extraConfig:
      s3:
        client:
          default:
            protocol: http
            endpoint: "http://192.168.31.111:9090"
            path_style_access: true
    initScripts:
      configure-s3-client.sh: |
        elasticsearch_set_key_value "s3.client.default.access_key" "${S3_ACCESSKEY}"
        elasticsearch_set_key_value "s3.client.default.secret_key" "${S3_SECRETKEY}"
    hostAliases:
    - ip: 192.168.31.111
      hostnames:
      - minio-api.dev.tech

Methods

There are two ways to back up Elasticsearch:

  1. Export the data to text files, e.g. using tools such as elasticdump or esm to dump the data stored in Elasticsearch into files.
  2. Use the snapshot API, which supports incremental backups.

The first approach is relatively simple and practical for small data volumes, but for large data volumes the snapshot API is the recommended way.

Steps

backup


  1. Create a snapshot repository -> my_s3_repository
curl -k -X PUT "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "local-test",
    "client": "default",
    "endpoint": "http://192.168.31.111:9000"
  }
}
'

You can also use a storage class to mount a path into the pod and store the snapshot files on that external volume.

  2. Verify that every node in the cluster can use this snapshot repository
curl -k -X POST "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository/_verify?pretty"
  3. List all snapshot repositories
curl -k -X GET "https://elastic-search.dev.tech:32443/_snapshot/_all?pretty"
  4. Inspect the settings of a specific snapshot repository
curl -k -X GET "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository?pretty"
  5. Analyze a snapshot repository
curl -k -X POST "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository/_analyze?blob_count=10&max_blob_size=1mb&timeout=120s&pretty"
  6. Take a snapshot manually
curl -k -X PUT "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository/ay_s3_snap_02?pretty"
  7. List the snapshots available in a given repository
curl -k -X GET "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository/*?verbose=false&pretty"
  8. Test a restore
# Delete an index
curl -k -X DELETE "https://elastic-search.dev.tech:32443/books?pretty"

# restore that index
curl -k -X POST "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository/ay_s3_snap_02/_restore?pretty" -H 'Content-Type: application/json' -d'
{
  "indices": "books"
}
'

# query
curl -k -X GET "https://elastic-search.dev.tech:32443/books/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all": {}
  }
}
'
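To watch a snapshot while it is running, the snapshot status API can be used (a sketch; the snapshot name must match one created in step 6):

curl -k -X GET "https://elastic-search.dev.tech:32443/_snapshot/my_s3_repository/ay_s3_snap_02/_status?pretty"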
Oct 7, 2024

ES Auto BackUp

Preliminary

  • ElasticSearch has been installed; if not, check this link

  • We use a local disk to save the snapshots; for more details check this link

  • And the security is enabled.

    ES argocd-app yaml
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: elastic-search
    spec:
      syncPolicy:
        syncOptions:
        - CreateNamespace=true
      project: default
      source:
        repoURL: https://charts.bitnami.com/bitnami
        chart: elasticsearch
        targetRevision: 19.11.3
        helm:
          releaseName: elastic-search
          values: |
            global:
              kibanaEnabled: true
            clusterName: elastic
            image:
              registry: m.zjvis.net/docker.io
              pullPolicy: IfNotPresent
            security:
              enabled: true
              tls:
                autoGenerated: true
            service:
              type: ClusterIP
            extraConfig:
              path:
                repo: /tmp
            ingress:
              enabled: true
              annotations:
                cert-manager.io/cluster-issuer: self-signed-ca-issuer
                nginx.ingress.kubernetes.io/rewrite-target: /$1
              hostname: elastic-search.dev.tech
              ingressClassName: nginx
              path: /?(.*)
              tls: true
            master:
              masterOnly: false
              replicaCount: 1
              persistence:
                enabled: false
              resources:
                requests:
                  cpu: 2
                  memory: 1024Mi
                limits:
                  cpu: 4
                  memory: 4096Mi
              heapSize: 2g
            data:
              replicaCount: 0
              persistence:
                enabled: false
            coordinating:
              replicaCount: 0
            ingest:
              enabled: true
              replicaCount: 0
              service:
                enabled: false
                type: ClusterIP
              ingress:
                enabled: false
            metrics:
              enabled: false
              image:
                registry: m.zjvis.net/docker.io
                pullPolicy: IfNotPresent
            volumePermissions:
              enabled: false
              image:
                registry: m.zjvis.net/docker.io
                pullPolicy: IfNotPresent
            sysctlImage:
              enabled: true
              registry: m.zjvis.net/docker.io
              pullPolicy: IfNotPresent
            kibana:
              elasticsearch:
                hosts:
                  - '{{ include "elasticsearch.service.name" . }}'
                port: '{{ include "elasticsearch.service.ports.restAPI" . }}'
            esJavaOpts: "-Xmx2g -Xms2g"        
      destination:
        server: https://kubernetes.default.svc
        namespace: application

    Diff from the original file:

    security:
      enabled: true
    extraConfig:
        path:
          repo: /tmp

Methods

Steps

auto backup
  1. Create a snapshot repository -> slm_fs_repository
curl --user elastic:L9shjg6csBmPZgCZ -k -X PUT "https://10.88.0.143:30294/_snapshot/slm_fs_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/tmp"
  }
}
'

You can also use a storage class to mount a path into the pod and store the snapshot files on that external volume.

  2. Verify that every node in the cluster can use this snapshot repository
curl --user elastic:L9shjg6csBmPZgCZ  -k -X POST "https://10.88.0.143:30294/_snapshot/slm_fs_repository/_verify?pretty"
  3. List all snapshot repositories
curl --user elastic:L9shjg6csBmPZgCZ  -k -X GET "https://10.88.0.143:30294/_snapshot/_all?pretty"
  4. Inspect the settings of a specific snapshot repository
curl --user elastic:L9shjg6csBmPZgCZ  -k -X GET "https://10.88.0.143:30294/_snapshot/slm_fs_repository?pretty"
  5. Analyze a snapshot repository
curl --user elastic:L9shjg6csBmPZgCZ  -k -X POST "https://10.88.0.143:30294/_snapshot/slm_fs_repository/_analyze?blob_count=10&max_blob_size=1mb&timeout=120s&pretty"
  6. List the snapshots available in a given repository
curl --user elastic:L9shjg6csBmPZgCZ  -k -X GET "https://10.88.0.143:30294/_snapshot/slm_fs_repository/*?verbose=false&pretty"
  7. Create an SLM admin role
curl --user elastic:L9shjg6csBmPZgCZ -k -X POST "https://10.88.0.143:30294/_security/role/slm-admin?pretty" -H 'Content-Type: application/json' -d'
{
  "cluster": [ "manage_slm", "cluster:admin/snapshot/*" ],
  "indices": [
    {
      "names": [ ".slm-history-*" ],
      "privileges": [ "all" ]
    }
  ]
}
'
  8. Create the automatic backup cron job (SLM policy)
curl --user elastic:L9shjg6csBmPZgCZ -k -X PUT "https://10.88.0.143:30294/_slm/policy/nightly-snapshots?pretty" -H 'Content-Type: application/json' -d'
{
  "schedule": "0 30 1 * * ?",       
  "name": "<nightly-snap-{now/d}>", 
  "repository": "slm_fs_repository",    
  "config": {
    "indices": "*",                 
    "include_global_state": true    
  },
  "retention": {                    
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
'
  9. Trigger the automatic backup manually
curl --user elastic:L9shjg6csBmPZgCZ -k -X POST "https://10.88.0.143:30294/_slm/policy/nightly-snapshots/_execute?pretty"
  10. View the SLM backup statistics/history
curl --user elastic:L9shjg6csBmPZgCZ -k -X GET "https://10.88.0.143:30294/_slm/stats?pretty"
  11. Test a restore
# Delete an index
curl --user elastic:L9shjg6csBmPZgCZ  -k -X DELETE "https://10.88.0.143:30294/books?pretty"

# restore that index (replace my_snapshot_2099.05.06 below with a real snapshot name, e.g. one listed in step 6)
curl --user elastic:L9shjg6csBmPZgCZ  -k -X POST "https://10.88.0.143:30294/_snapshot/slm_fs_repository/my_snapshot_2099.05.06/_restore?pretty" -H 'Content-Type: application/json' -d'
{
  "indices": "books"
}
'

# query
curl --user elastic:L9shjg6csBmPZgCZ  -k -X GET "https://10.88.0.143:30294/books/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all": {}
  }
}
'
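To review the policy later, it can be fetched back; the response includes the policy definition, the next scheduled execution, and the last successful snapshot (a sketch):

curl --user elastic:L9shjg6csBmPZgCZ -k -X GET "https://10.88.0.143:30294/_slm/policy/nightly-snapshots?pretty"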
Oct 7, 2024

Subsections of Cheat Sheet

Subsections of Aliyun Related

OSSutil

Aliyun's counterpart of MinIO (https://min.io/): a command-line tool for OSS object storage.

download ossutil

First, you need to download ossutil:

# Linux
curl https://gosspublic.alicdn.com/ossutil/install.sh | sudo bash
# Windows
curl -o ossutil-v1.7.19-windows-386.zip https://gosspublic.alicdn.com/ossutil/1.7.19/ossutil-v1.7.19-windows-386.zip

config ossutil

./ossutil config
Params           Description                                              Instruction
endpoint         the Endpoint of the region where the Bucket is located
accessKeyID      OSS AccessKey                                            get from user info panel
accessKeySecret  OSS AccessKeySecret                                      get from user info panel
stsToken         token for sts service                                    could be empty
Info: you can also modify the /home/<$user>/.ossutilconfig file directly to change the configuration.

list files

ossutil ls oss://<$PATH>
For example
ossutil ls oss://csst-data/CSST-20240312/dfs/

download file/dir

You can use cp to download or upload files

ossutil cp -r oss://<$PATH> <$PTHER_PATH>
For example
ossutil cp -r oss://csst-data/CSST-20240312/dfs/ /data/nfs/data/pvc...

upload file/dir

ossutil cp -r <$SOURCE_PATH> oss://<$PATH>
For example
ossutil cp -r /data/nfs/data/pvc/a.txt  oss://csst-data/CSST-20240312/dfs/b.txt
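To confirm an upload landed, ossutil stat prints the object's metadata (a sketch reusing the path from the example above):

ossutil stat oss://csst-data/CSST-20240312/dfs/b.txt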
Mar 24, 2024

ECS DNS

ZJADC (Aliyun Directed Cloud)

Append the following content to /etc/resolv.conf

options timeout:2 attempts:3 rotate
nameserver 10.255.9.2
nameserver 10.200.12.5

Then you probably need to modify the repos under /etc/yum.repos.d/ as well; check this link


YQGCY (Aliyun Directed Cloud)

Append the following content to /etc/resolv.conf

nameserver 172.27.205.79

Then restart the coredns-xxxx pods in the kube-system namespace


Public DNS (Google and AliDNS)

nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 223.5.5.5
nameserver 223.6.6.6

Restart DNS

On NetworkManager-based systems (e.g. CentOS/Fedora):

vim /etc/NetworkManager/NetworkManager.conf

add "dns=none" under the '[main]' section, then

systemctl restart NetworkManager

On systemd-resolved-based systems (e.g. Ubuntu):

sudo systemctl is-active systemd-resolved
sudo resolvectl flush-caches
# or sudo systemd-resolve --flush-caches

Modify ifcfg-ethX [Optional]

If you cannot get an IPv4 address, you can try modifying ifcfg-ethX

vim /etc/sysconfig/network-scripts/ifcfg-ens33

set ONBOOT=yes

Mar 14, 2024

OS Mirrors

Fedora

  • Fedora 40 located in /etc/yum.repos.d/
    Fedora Mirror
    [updates]
    name=Fedora $releasever - $basearch - Updates
    #baseurl=http://download.example/pub/fedora/linux/updates/$releasever/Everything/$basearch/
    metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-released-f$releasever&arch=$basearch
    enabled=1
    countme=1
    repo_gpgcheck=0
    type=rpm
    gpgcheck=1
    metadata_expire=6h
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
    skip_if_unavailable=False
    
    [updates-debuginfo]
    name=Fedora $releasever - $basearch - Updates - Debug
    #baseurl=http://download.example/pub/fedora/linux/updates/$releasever/Everything/$basearch/debug/
    metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-released-debug-f$releasever&arch=$basearch
    enabled=0
    repo_gpgcheck=0
    type=rpm
    gpgcheck=1
    metadata_expire=6h
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
    skip_if_unavailable=False
    
    [updates-source]
    name=Fedora $releasever - Updates Source
    #baseurl=http://download.example/pub/fedora/linux/updates/$releasever/Everything/SRPMS/
    metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-released-source-f$releasever&arch=$basearch
    enabled=0
    repo_gpgcheck=0
    type=rpm
    gpgcheck=1
    metadata_expire=6h
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
    skip_if_unavailable=False

CentOS

  • CentOS 7 located in /etc/yum.repos.d/

    CentOS Mirror
    [base]
    name=CentOS-$releasever
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
    baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
    gpgcheck=1
    gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-7
    
    [extras]
    name=CentOS-$releasever
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras
    baseurl=http://mirror.centos.org/centos/$releasever/extras/$basearch/
    gpgcheck=1
    gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-7
    Aliyun Mirror
    [base]
    name=CentOS-$releasever - Base - mirrors.aliyun.com
    failovermethod=priority
    baseurl=http://mirrors.aliyun.com/centos/$releasever/os/$basearch/
    gpgcheck=1
    gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-7
    
    [extras]
    name=CentOS-$releasever - Extras - mirrors.aliyun.com
    failovermethod=priority
    baseurl=http://mirrors.aliyun.com/centos/$releasever/extras/$basearch/
    gpgcheck=1
    gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-7
    163 Mirror
    [base]
    name=CentOS-$releasever - Base - 163.com
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
    baseurl=http://mirrors.163.com/centos/$releasever/os/$basearch/
    gpgcheck=1
    gpgkey=http://mirrors.163.com/centos/RPM-GPG-KEY-CentOS-7
    
    [extras]
    name=CentOS-$releasever - Extras - 163.com
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras
    baseurl=http://mirrors.163.com/centos/$releasever/extras/$basearch/
    gpgcheck=1
    gpgkey=http://mirrors.163.com/centos/RPM-GPG-KEY-CentOS-7

  • CentOS 8 stream located in /etc/yum.repos.d/

    CentOS Mirror
    [baseos]
    name=CentOS Linux - BaseOS
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=BaseOS&infra=$infra
    baseurl=http://mirror.centos.org/centos/8-stream/BaseOS/$basearch/os/
    gpgcheck=1
    enabled=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
    
    [extras]
    name=CentOS Linux - Extras
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras&infra=$infra
    baseurl=http://mirror.centos.org/centos/8-stream/extras/$basearch/os/
    gpgcheck=1
    enabled=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
    
    [appstream]
    name=CentOS Linux - AppStream
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=AppStream&infra=$infra
    baseurl=http://mirror.centos.org/centos/8-stream/AppStream/$basearch/os/
    gpgcheck=1
    enabled=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
    Aliyun Mirror
    [base]
    name=CentOS-8.5.2111 - Base - mirrors.aliyun.com
    baseurl=http://mirrors.aliyun.com/centos-vault/8.5.2111/BaseOS/$basearch/os/
    gpgcheck=0
    gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-Official
    
    [extras]
    name=CentOS-8.5.2111 - Extras - mirrors.aliyun.com
    baseurl=http://mirrors.aliyun.com/centos-vault/8.5.2111/extras/$basearch/os/
    gpgcheck=0
    gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-Official
    
    [AppStream]
    name=CentOS-8.5.2111 - AppStream - mirrors.aliyun.com
    baseurl=http://mirrors.aliyun.com/centos-vault/8.5.2111/AppStream/$basearch/os/
    gpgcheck=0
    gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-Official

Ubuntu

  • Ubuntu 18.04 located in /etc/apt/sources.list

    Ubuntu Mirror
    deb http://archive.ubuntu.com/ubuntu/ bionic main restricted
    deb http://archive.ubuntu.com/ubuntu/ bionic-updates main restricted
    deb http://archive.ubuntu.com/ubuntu/ bionic-backports main restricted universe multiverse
    deb http://security.ubuntu.com/ubuntu/ bionic-security main restricted

  • Ubuntu 20.04 located in /etc/apt/sources.list

    Ubuntu Mirror
    deb http://archive.ubuntu.com/ubuntu/ focal main restricted universe multiverse
    deb http://archive.ubuntu.com/ubuntu/ focal-updates main restricted universe multiverse
    deb http://archive.ubuntu.com/ubuntu/ focal-backports main restricted universe multiverse
    deb http://security.ubuntu.com/ubuntu/ focal-security main restricted

  • Ubuntu 22.04 located in /etc/apt/sources.list

    Ubuntu Mirror
    deb http://archive.ubuntu.com/ubuntu/ jammy main restricted
    deb http://archive.ubuntu.com/ubuntu/ jammy-updates main restricted
    deb http://archive.ubuntu.com/ubuntu/ jammy-backports main restricted universe multiverse
    deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted

Debian

  • Debian Buster located in /etc/apt/sources.list

    Debian Mirror
    deb http://deb.debian.org/debian buster main
    deb http://security.debian.org/debian-security buster/updates main
    deb http://deb.debian.org/debian buster-updates main
    Aliyun Mirror
    deb http://mirrors.aliyun.com/debian/ buster main non-free contrib
    deb http://mirrors.aliyun.com/debian-security buster/updates main
    deb http://mirrors.aliyun.com/debian/ buster-updates main non-free contrib
    deb http://mirrors.aliyun.com/debian/ buster-backports main non-free contrib
    Tuna Mirror
    deb http://mirrors.tuna.tsinghua.edu.cn/debian/ buster main contrib non-free
    deb http://mirrors.tuna.tsinghua.edu.cn/debian/ buster-updates main contrib non-free
    deb http://mirrors.tuna.tsinghua.edu.cn/debian/ buster-backports main contrib non-free
    deb http://security.debian.org/debian-security buster/updates main contrib non-free

  • Debian Bullseye located in /etc/apt/sources.list

    Debian Mirror
    deb http://deb.debian.org/debian bullseye main
    deb http://security.debian.org/debian-security bullseye-security main
    deb http://deb.debian.org/debian bullseye-updates main
    Aliyun Mirror
    deb http://mirrors.aliyun.com/debian/ bullseye main non-free contrib
    deb http://mirrors.aliyun.com/debian-security/ bullseye-security main
    deb http://mirrors.aliyun.com/debian/ bullseye-updates main non-free contrib
    deb http://mirrors.aliyun.com/debian/ bullseye-backports main non-free contrib
    Tuna Mirror
    deb http://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye main contrib non-free
    deb http://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye-updates main contrib non-free
    deb http://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye-backports main contrib non-free
    deb http://security.debian.org/debian-security bullseye-security main contrib non-free

Anolis

  • Anolis 3 located in /etc/yum.repos.d/

    Aliyun Mirror
    [alinux3-module]
    name=alinux3-module
    baseurl=http://mirrors.aliyun.com/alinux/3/module/$basearch/
    gpgkey=http://mirrors.aliyun.com/alinux/3/RPM-GPG-KEY-ALINUX-3
    enabled=1
    gpgcheck=1
    
    [alinux3-os]
    name=alinux3-os
    baseurl=http://mirrors.aliyun.com/alinux/3/os/$basearch/
    gpgkey=http://mirrors.aliyun.com/alinux/3/RPM-GPG-KEY-ALINUX-3
    enabled=1
    gpgcheck=1
    
    [alinux3-plus]
    name=alinux3-plus
    baseurl=http://mirrors.aliyun.com/alinux/3/plus/$basearch/
    gpgkey=http://mirrors.aliyun.com/alinux/3/RPM-GPG-KEY-ALINUX-3
    enabled=1
    gpgcheck=1
    
    [alinux3-powertools]
    name=alinux3-powertools
    baseurl=http://mirrors.aliyun.com/alinux/3/powertools/$basearch/
    gpgkey=http://mirrors.aliyun.com/alinux/3/RPM-GPG-KEY-ALINUX-3
    enabled=1
    gpgcheck=1
    
    [alinux3-updates]
    name=alinux3-updates
    baseurl=http://mirrors.aliyun.com/alinux/3/updates/$basearch/
    gpgkey=http://mirrors.aliyun.com/alinux/3/RPM-GPG-KEY-ALINUX-3
    enabled=1
    gpgcheck=1
    
    [epel]
    name=Extra Packages for Enterprise Linux 8 - $basearch
    baseurl=http://mirrors.aliyun.com/epel/8/Everything/$basearch
    failovermethod=priority
    enabled=1
    gpgcheck=1
    gpgkey=http://mirrors.aliyun.com/epel/RPM-GPG-KEY-EPEL-8
    
    [epel-module]
    name=Extra Packages for Enterprise Linux 8 - $basearch
    baseurl=http://mirrors.aliyun.com/epel/8/Modular/$basearch
    failovermethod=priority
    enabled=0
    gpgcheck=1
    gpgkey=http://mirrors.aliyun.com/epel/RPM-GPG-KEY-EPEL-8

  • Anolis 2 located in /etc/yum.repos.d/

    Aliyun Mirror


Refresh Repo

# Fedora / CentOS 8+
dnf clean all && dnf makecache
# CentOS 7
yum clean all && yum makecache
# Debian / Ubuntu
apt-get clean all
Mar 14, 2024

Subsections of App Related

Mirrors [Aliyun, Tsinghua]

Gradle Tencent Mirror

https://mirrors.cloud.tencent.com/gradle/gradle-8.0-bin.zip

PIP Tuna Mirror

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple some-package
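To make the mirror the default instead of passing -i every time, pip's config command can be used (a sketch; requires pip >= 10):

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple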

Maven Mirror

<mirror>
    <id>aliyunmaven</id>
    <mirrorOf>*</mirrorOf>
    <name>阿里云公共仓库</name>
    <url>https://maven.aliyun.com/repository/public</url>
</mirror>
Mar 7, 2024

Subsections of Git Related

Not Allow Push

Cannot push to your own branch


  1. Edit .git/config file under your repo directory.

  2. Find the url= entry under the [remote "origin"] section.

  3. Change it from:

    url=https://gitlab.com/AaronYang2333/ska-src-dm-local-data-preparer.git/

    to:

    url=ssh://git@gitlab.com/AaronYang2333/ska-src-dm-local-data-preparer.git

  4. Try to push again
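Alternatively, the same change can be made without editing the file, using git remote set-url:

git remote set-url origin ssh://git@gitlab.com/AaronYang2333/ska-src-dm-local-data-preparer.git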

Mar 12, 2025

Subsections of Linux Related

Disable Service

Disable the firewalld, selinux, dnsmasq, NetworkManager, and swap services

systemctl disable --now firewalld 
systemctl disable --now dnsmasq
systemctl disable --now NetworkManager

setenforce 0
sed -i 's#SELINUX=permissive#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=permissive#SELINUX=disabled#g' /etc/selinux/config
reboot
getenforce


swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
Mar 14, 2024

Example Shell Script

Cleanup

  1. Find the 10 biggest files
dnf install ncdu

# Find the 10 largest files/directories under the current directory
du -ah . | sort -rh | head -n 10

# Find files larger than 100M under the home directory
find ~ -type f -size +100M -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
  2. Clean caches
rm -rf ~/.cache/*
sudo rm -rf /tmp/*
sudo rm -rf /var/tmp/*
  3. Clean images
# Remove all stopped containers
podman container prune -f

# Remove all images not referenced by any container (dangling images)
podman image prune

# More aggressive cleanup: remove all images not used by a running container
podman image prune -a

# Clean the build cache
podman builder prune

# The most thorough cleanup: remove all stopped containers, all networks not used
# by a container, all dangling images, and the build cache
podman system prune
podman system prune -a # even more thorough: removes all unused images, not just dangling ones
Mar 14, 2024

Example Shell Script

Init ES Backup Setting

Create an ES backup repository in S3, and take a snapshot after creation

#!/bin/bash
ES_HOST="http://192.168.58.2:30910"
ES_BACKUP_REPO_NAME="s3_fs_repository"
S3_CLIENT="default"
ES_BACKUP_BUCKET_IN_S3="es-snapshot"
ES_SNAPSHOT_TAG="auto"

CHECK_RESPONSE=$(curl -s -k -X POST "$ES_HOST/_snapshot/$ES_BACKUP_REPO_NAME/_verify?pretty" )
CHECKED_NODES=$(echo "$CHECK_RESPONSE" | jq -r '.nodes')


if [ "$CHECKED_NODES" == null ]; then
  echo "No ES backup repository exists yet..."
  echo "A default backup repository will be created (using the '$S3_CLIENT' s3 client; all backup files will be saved in the bucket '$ES_BACKUP_BUCKET_IN_S3')."

  CREATE_RESPONSE=$(curl -s -k -X PUT "$ES_HOST/_snapshot/$ES_BACKUP_REPO_NAME?pretty" -H 'Content-Type: application/json' -d "{\"type\":\"s3\",\"settings\":{\"bucket\":\"$ES_BACKUP_BUCKET_IN_S3\",\"client\":\"$S3_CLIENT\"}}")
  CREATE_ACKNOWLEDGED_FLAG=$(echo "$CREATE_RESPONSE" | jq -r '.acknowledged')

  if [ "$CREATE_ACKNOWLEDGED_FLAG" == true ]; then
    echo "Backup repository '$ES_BACKUP_REPO_NAME' has been created successfully!"
  else
    echo "Failed to create backup repository '$ES_BACKUP_REPO_NAME', since $CREATE_RESPONSE"
  fi
else
  echo "An ES backup repository '$ES_BACKUP_REPO_NAME' already exists"
fi

CHECK_RESPONSE=$(curl -s -k -X POST "$ES_HOST/_snapshot/$ES_BACKUP_REPO_NAME/_verify?pretty" )
CHECKED_NODES=$(echo "$CHECK_RESPONSE" | jq -r '.nodes')

if [ "$CHECKED_NODES" != null ]; then
  SNAPSHOT_NAME="meta-data-$ES_SNAPSHOT_TAG-snapshot-$(date +%s)"
  SNAPSHOT_CREATION=$(curl -s -k -X PUT "$ES_HOST/_snapshot/$ES_BACKUP_REPO_NAME/$SNAPSHOT_NAME")
  echo "Snapshot $SNAPSHOT_NAME has been created."
else
  echo "Failed to create a snapshot: the backup repository could not be verified."
fi
Mar 14, 2024

Login Without Pwd

copy id_rsa.pub to other nodes

yum install sshpass -y
mkdir -p /extend/shell

cat >>/extend/shell/fenfa_pub.sh<< EOF
#!/bin/bash
ROOT_PASS=root123
ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
for ip in 101 102 103 
do
sshpass -p\$ROOT_PASS ssh-copy-id -o StrictHostKeyChecking=no 192.168.29.\$ip
done
EOF

cd /extend/shell
chmod +x fenfa_pub.sh

./fenfa_pub.sh
Mar 14, 2024

Set Http Proxy

set http proxy

export https_proxy=http://localhost:20171
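Some tools only honor the lowercase http_proxy variant or need local addresses excluded, so it can help to set all of them (a sketch; the proxy address is the one from above):

export http_proxy=http://localhost:20171
export https_proxy=http://localhost:20171
export no_proxy=localhost,127.0.0.1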
Mar 14, 2024

Subsections of Storage Related

User Based Policy

User Based Policy

you can change <$bucket> to control the permission

  • ${aws:username} is a built-in variable, indicating the logged-in user name.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowUserToSeeBucketListInTheConsole",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:GetBucketLocation"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::*"
            ]
        },
        {
            "Sid": "AllowRootAndHomeListingOfCompanyBucket",
            "Action": [
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::<$bucket>"
            ],
            "Condition": {
                "StringEquals": {
                    "s3:prefix": [
                        "",
                        "<$path>/",
                        "<$path>/${aws:username}"
                    ],
                    "s3:delimiter": [
                        "/"
                    ]
                }
            }
        },
        {
            "Sid": "AllowListingOfUserFolder",
            "Action": [
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::<$bucket>"
            ],
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "<$path>/${aws:username}/*"
                    ]
                }
            }
        },
        {
            "Sid": "AllowAllS3ActionsInUserFolder",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::<$bucket>/<$path>/${aws:username}/*"
            ]
        }
    ]
}
  • <$uid> is Aliyun UID
{
    "Version": "1",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "oss:*"
        ],
        "Principal": [
            "<$uid>"
        ],
        "Resource": [
            "acs:oss:*:<$oss_id>:<$bucket>/<$path>/*"
        ]
    }, {
        "Effect": "Allow",
        "Action": [
            "oss:ListObjects",
            "oss:GetObject"
        ],
        "Principal": [
             "<$uid>"
        ],
        "Resource": [
            "acs:oss:*:<$oss_id>:<$bucket>"
        ],
        "Condition": {
            "StringLike": {
            "oss:Prefix": [
                    "<$path>/*"
                ]
            }
        }
    }]
}
Example:
{
	"Version": "1",
	"Statement": [{
		"Effect": "Allow",
		"Action": [
			"oss:*"
		],
		"Principal": [
			"203415213249511533"
		],
		"Resource": [
			"acs:oss:*:1007296819402486:conti-csst/test/*"
		]
	}, {
		"Effect": "Allow",
		"Action": [
			"oss:ListObjects",
			"oss:GetObject"
		],
		"Principal": [
			"203415213249511533"
		],
		"Resource": [
			"acs:oss:*:1007296819402486:conti-csst"
		],
		"Condition": {
			"StringLike": {
				"oss:Prefix": [
					"test/*"
				]
			}
		}
	}]
}
Mar 14, 2024

Subsections of Command

Git CMD

Init global config

git config --list
git config --global user.name "AaronYang"
git config --global user.email aaron19940628@gmail.com
git config --global pager.branch false
git config --global pull.ff only
git --no-pager diff

change user and email (locally)

git config user.name ""
git config user.email ""

list all remote repo

git remote -v
modify remote repo
git remote set-url origin git@github.com:<$user>/<$repo>.git

Get specific file from remote

git archive --remote=git@github.com:<$user>/<$repo>.git <$branch>:<$source_file_path> -o <$target_source_path>
for example
git archive --remote=git@github.com:AaronYang2333/LOL_Overlay_Assistant_Tool.git master:paper/2003.11755.pdf -o a.pdf

Clone specific branch

git clone -b slurm-23.02 --single-branch --depth=1 https://github.com/SchedMD/slurm.git

Update submodule

git submodule add --depth 1 https://github.com/xxx/xxxx a/b/c

git submodule update --init --recursive

Save credential

login first and then execute this

git config --global credential.helper store
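If you prefer not to keep the credential on disk in plain text, the cache helper holds it in memory for a limited time instead (a sketch; the timeout is in seconds):

git config --global credential.helper 'cache --timeout=3600'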

Delete Branch

  • Deleting a remote branch
    git push origin --delete <branch>  # Git version 1.7.0 or newer
    git push origin -d <branch>        # Shorter version (Git 1.7.0 or newer)
    git push origin :<branch>          # Git versions older than 1.7.0
  • Deleting a local branch
    git branch --delete <branch>
    git branch -d <branch> # Shorter version
    git branch -D <branch> # Force-delete un-merged branches

Prune remote branches

git remote prune origin

Update remote repo

git remote set-url origin http://xxxxx.git
Mar 7, 2024

Linux

useradd

sudo useradd <$name> -m -r -s /bin/bash -p <$password>
add as sudoer
echo '<$name> ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers

telnet

a command-line interface for communication with a remote device or server

telnet <$ip> <$port>
for example
telnet 172.27.253.50 9000 #test application connectivity

lsof (list open files)

everything is a file

lsof <$option:value>
for example

-a List processes that have open files

-c <process_name> List files opened by the specified process

-g List GID number process details

-d <FD_number> List processes using the given file descriptor number

+d List open files in a directory

+D Recursively list open files in a directory

-n List files using NFS

-i List eligible processes. (protocol, :port, @ip)

-p List files opened by the specified process ID

-u List UID number process details

lsof -i:30443 # find port 30443 
lsof -i -P -n # list all connections

awk (named after Aho, Weinberger, and Kernighan)

awk is a scripting language used for manipulating data and generating reports.

# awk [params] 'script' 
awk <$params> <$string_content>
for example

Filter lines whose first field is greater than 3

echo -e "1\n2\n3\n4\n5\n" | awk '$1>3'


ss (socket statistics)

view detailed information about your system’s network connections, including TCP/IP, UDP, and Unix domain sockets

ss [options]
for example
Options   Description
-t        Display TCP sockets
-l        Display listening sockets
-n        Show numerical addresses instead of resolving
-a        Display all sockets (listening and non-listening)
#show all listening TCP connection
ss -tln
#show all established TCP connections
ss -tan

clean files older than 3 days

find /aaa/bbb/ccc/*.gz -mtime +3 -exec rm {} \;

ssh without affect $HOME/.ssh/known_hosts

ssh -o "UserKnownHostsFile /dev/null" root@aaa.domain.com
ssh -o "UserKnownHostsFile /dev/null" -o "StrictHostKeyChecking=no" root@aaa.domain.com

sync clock

[yum|dnf] install -y chrony \
    && systemctl enable chronyd \
    && (systemctl is-active chronyd || systemctl start chronyd) \
    && chronyc sources \
    && chronyc tracking \
    && timedatectl set-timezone 'Asia/Shanghai'

set hostname

hostnamectl set-hostname develop

add remote key to other server

ssh -o "UserKnownHostsFile /dev/null" \
    root@aaa.bbb.ccc \
    "mkdir -p /root/.ssh && chmod 700 /root/.ssh && echo '$SOME_PUBLIC_KEY' \
    >> /root/.ssh/authorized_keys && chmod 600 /root/.ssh/authorized_keys"
for example
ssh -o "UserKnownHostsFile /dev/null" \
    root@17.27.253.67 \
    "mkdir -p /root/.ssh && chmod 700 /root/.ssh && echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC00JLKF/Cd//rJcdIVGCX3ePo89KAgEccvJe4TEHs5pI5FSxs/7/JfQKZ+by2puC3IT88bo/d7nStw9PR3BXgqFXaBCknNBpSLWBIuvfBF+bcL+jGnQYo2kPjrO+2186C5zKGuPRi9sxLI5AkamGB39L5SGqwe5bbKq2x/8OjUP25AlTd99XsNjEY2uxNVClHysExVad/ZAcl0UVzG5xmllusXCsZVz9HlPExqB6K1sfMYWvLVgSCChx6nUfgg/NZrn/kQG26X0WdtXVM2aXpbAtBioML4rWidsByDb131NqYpJF7f+x3+I5pQ66Qpc72FW1G4mUiWWiGhF9tL8V9o1AY96Rqz0AVaxAQrBEuyCWKrXbA97HeC3Xp57Luvlv9TqUd8CIJYq+QTL0hlIDrzK9rJsg34FRAvf9sh8K2w/T/gC9UnRjRXgkPUgKldq35Y6Z9wP6KY45gCXka1PU4nVqb6wicO+RHcZ5E4sreUwqfTypt5nTOgW2/p8iFhdN8= Administrator@AARON-X1-8TH' \
    >> /root/.ssh/authorized_keys && chmod 600 /root/.ssh/authorized_keys"

set -x

This will print each command to the standard error before executing it, which is useful for debugging scripts.

set -x

set -e

Exit immediately if a command exits with a non-zero status.

set -e

sed (Stream Editor)

sed <$option> <$file_path>
for example

replace unix -> linux

echo "linux is great os. unix is opensource. unix is free os." | sed 's/unix/linux/'

or you can check https://www.geeksforgeeks.org/sed-command-in-linux-unix-with-examples/

fdisk

list all disk

fdisk -l

create XFS file system

Use the mkfs.xfs command to create an XFS file system (with the internal log on the same disk), as shown below:

mkfs.xfs <$path>

modprobe

program to add and remove modules from the Linux Kernel

modprobe nfs && modprobe nfsd

disown

disown command in Linux is used to remove jobs from the job table.

disown [options] jobID1 jobID2 ... jobIDN
for example

for example, there is a job running in the background

ping google.com > /dev/null &

use jobs -l to list all running jobs

jobs -l

use disown -a to remove all jobs from the job table

disown -a

use disown %2 to remove job #2

disown %2

generate SSH key

ssh-keygen -t rsa -b 4096 -C "aaron19940628@gmail.com"
sudo ln -sf <$install_path>/bin/* /usr/local/bin

append dir into $PATH (temporary)

export PATH="/root/bin:$PATH"
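To make the change permanent for future shells (a sketch assuming bash; it appends to ~/.bashrc):

echo 'export PATH="/root/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc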

copy public key to ECS

ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.200.60.53
Mar 12, 2024

Maven

1. build from submodule

You don't need to build from the root of the project.

./mvnw clean package -DskipTests  -rf :<$submodule-name>

You can find the <$submodule-name> in the submodule's pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
		xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

	<modelVersion>4.0.0</modelVersion>

	<parent>
		<groupId>org.apache.flink</groupId>
		<artifactId>flink-formats</artifactId>
		<version>1.20-SNAPSHOT</version>
	</parent>

	<artifactId>flink-avro</artifactId>
	<name>Flink : Formats : Avro</name>

Then you can modify the command as

./mvnw clean package -DskipTests  -rf :flink-avro
The result will look like this
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING] 
[INFO] ------------------------------------------------------------------------
[INFO] Detecting the operating system and CPU architecture
[INFO] ------------------------------------------------------------------------
[INFO] os.detected.name: linux
[INFO] os.detected.arch: x86_64
[INFO] os.detected.bitness: 64
[INFO] os.detected.version: 6.7
[INFO] os.detected.version.major: 6
[INFO] os.detected.version.minor: 7
[INFO] os.detected.release: fedora
[INFO] os.detected.release.version: 38
[INFO] os.detected.release.like.fedora: true
[INFO] os.detected.classifier: linux-x86_64
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO] 
[INFO] Flink : Formats : Avro                                             [jar]
[INFO] Flink : Formats : SQL Avro                                         [jar]
[INFO] Flink : Formats : Parquet                                          [jar]
[INFO] Flink : Formats : SQL Parquet                                      [jar]
[INFO] Flink : Formats : Orc                                              [jar]
[INFO] Flink : Formats : SQL Orc                                          [jar]
[INFO] Flink : Python                                                     [jar]
...

Normally, building Flink starts from the flink-parent module

2. Skip other tests

For example, you can skip the RAT check by doing this:

./mvnw clean package -DskipTests '-Drat.skip=true'
Mar 11, 2024

Gradle

1. spotless

Keep your code spotless; check more details at https://github.com/diffplug/spotless

See how to configure it

There are several files that need to be configured.

  1. settings.gradle.kts
plugins {
    id("org.gradle.toolchains.foojay-resolver-convention") version "0.7.0"
}
  2. build.gradle.kts
plugins {
    id("com.diffplug.spotless") version "6.23.3"
}
configure<com.diffplug.gradle.spotless.SpotlessExtension> {
    kotlinGradle {
        target("**/*.kts")
        ktlint()
    }
    java {
        target("**/*.java")
        googleJavaFormat()
            .reflowLongStrings()
            .skipJavadocFormatting()
            .reorderImports(false)
    }
    yaml {
        target("**/*.yaml")
        jackson()
            .feature("ORDER_MAP_ENTRIES_BY_KEYS", true)
    }
    json {
        target("**/*.json")
        targetExclude(".vscode/settings.json")
        jackson()
            .feature("ORDER_MAP_ENTRIES_BY_KEYS", true)
    }
}

And then, you can execute the following command to format your code.

./gradlew spotlessApply
./mvnw spotless:apply

2. shadowJar

ShadowJar can combine a project's dependency classes and resources into a single jar; check https://imperceptiblethoughts.com/shadow/

See how to configure it

You need to modify your build.gradle.kts

import com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar

plugins {
    java // Optional 
    id("com.github.johnrengelman.shadow") version "8.1.1"
}

tasks.named<ShadowJar>("shadowJar") {
    archiveBaseName.set("connector-shadow")
    archiveVersion.set("1.0")
    archiveClassifier.set("")
    manifest {
        attributes(mapOf("Main-Class" to "com.example.xxxxx.Main"))
    }
}
./gradlew shadowJar

3. check dependency

List your project's dependencies in a tree view

See how to configure it

You need to modify your build.gradle.kts

configurations {
    compileClasspath
}
./gradlew dependencies --configuration compileClasspath
./gradlew :<$module_name>:dependencies --configuration compileClasspath
Check Potential Result

result will look like this

compileClasspath - Compile classpath for source set 'main'.
+--- org.projectlombok:lombok:1.18.22
+--- org.apache.flink:flink-hadoop-fs:1.17.1
|    \--- org.apache.flink:flink-core:1.17.1
|         +--- org.apache.flink:flink-annotations:1.17.1
|         |    \--- com.google.code.findbugs:jsr305:1.3.9 -> 3.0.2
|         +--- org.apache.flink:flink-metrics-core:1.17.1
|         |    \--- org.apache.flink:flink-annotations:1.17.1 (*)
|         +--- org.apache.flink:flink-shaded-asm-9:9.3-16.1
|         +--- org.apache.flink:flink-shaded-jackson:2.13.4-16.1
|         +--- org.apache.commons:commons-lang3:3.12.0
|         +--- org.apache.commons:commons-text:1.10.0
|         |    \--- org.apache.commons:commons-lang3:3.12.0
|         +--- commons-collections:commons-collections:3.2.2
|         +--- org.apache.commons:commons-compress:1.21 -> 1.24.0
|         +--- org.apache.flink:flink-shaded-guava:30.1.1-jre-16.1
|         \--- com.google.code.findbugs:jsr305:1.3.9 -> 3.0.2
...
Mar 7, 2024

Elastic Search DSL

Basic Query

exists query

Returns documents that contain an indexed value for a field.

GET /_search
{
  "query": {
    "exists": {
      "field": "user"
    }
  }
}

The following search returns documents that are missing an indexed value for the user.id field.

GET /_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "user.id"
        }
      }
    }
  }
}
fuzzy query

Returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.

GET /_search
{
  "query": {
    "fuzzy": {
      "filed_A": {
        "value": "ki"
      }
    }
  }
}

The same query with all optional parameters spelled out:

GET /_search
{
  "query": {
    "fuzzy": {
      "filed_A": {
        "value": "ki",
        "fuzziness": "AUTO",
        "max_expansions": 50,
        "prefix_length": 0,
        "transpositions": true,
        "rewrite": "constant_score_blended"
      }
    }
  }
}

rewrite:

  • constant_score_boolean
  • constant_score_filter
  • top_terms_blended_freqs_N
  • top_terms_boost_N, top_terms_N
  • frequent_terms, score_delegating
ids query

Returns documents based on their IDs. This query uses document IDs stored in the _id field.

GET /_search
{
  "query": {
    "ids" : {
      "values" : ["2NTC5ZIBNLuBWC6V5_0Y"]
    }
  }
}
prefix query

The following search returns documents where the filed_A field contains a term that begins with ki.

GET /_search
{
  "query": {
    "prefix": {
      "filed_A": {
        "value": "ki",
         "rewrite": "constant_score_blended",
         "case_insensitive": true
      }
    }
  }
}

You can simplify the prefix query syntax by combining the <field> and value parameters.

GET /_search
{
  "query": {
    "prefix" : { "filed_A" : "ki" }
  }
}
range query

Returns documents that contain terms within a provided range.

GET /_search
{
  "query": {
    "range": {
      "filed_number": {
        "gte": 10,
        "lte": 20,
        "boost": 2.0
      }
    }
  }
}
GET /_search
{
  "query": {
    "range": {
      "filed_timestamp": {
        "time_zone": "+01:00",        
        "gte": "2020-01-01T00:00:00", 
        "lte": "now"                  
      }
    }
  }
}
regex query

Returns documents that contain terms matching a regular expression.

GET /_search
{
  "query": {
    "regexp": {
      "filed_A": {
        "value": "k.*y",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 10000,
        "rewrite": "constant_score_blended"
      }
    }
  }
}
term query

Returns documents that contain an exact term in a provided field.

You can use the term query to find documents based on a precise value such as a price, a product ID, or a username.

GET /_search
{
  "query": {
    "term": {
      "filed_A": {
        "value": "kimchy",
        "boost": 1.0
      }
    }
  }
}
wildcard query

Returns documents that contain terms matching a wildcard pattern.

A wildcard operator is a placeholder that matches one or more characters. For example, the * wildcard operator matches zero or more characters. You can combine wildcard operators with other characters to create a wildcard pattern.

GET /_search
{
  "query": {
    "wildcard": {
      "filed_A": {
        "value": "ki*y",
        "boost": 1.0,
        "rewrite": "constant_score_blended"
      }
    }
  }
}