Command

Git CMD

Init global config

git config --list
git config --global user.name "AaronYang"
git config --global user.email aaron19940628@gmail.com
git config --global pager.branch false
git config --global pull.ff only
git --no-pager diff

change user and email (locally)

git config user.name ""
git config user.email ""

list all remote repo

git remote -v

modify remote repo

git remote set-url origin git@github.com:<$user>/<$repo>.git

Get specific file from remote

git archive --remote=git@github.com:<$user>/<$repo>.git <$branch>:<$source_file_path> -o <$target_source_path>

for example

git archive --remote=git@github.com:AaronYang2333/LOL_Overlay_Assistant_Tool.git master:paper/2003.11755.pdf -o a.pdf

Clone specific branch

git clone -b slurm-23.02 --single-branch --depth=1 https://github.com/SchedMD/slurm.git

Update submodule

git submodule add –depth 1 https://github.com/xxx/xxxx a/b/c

git submodule update --init --recursive

Save credential

git config --global credential.helper store

Delete Branch

Deleting a remote branch

git push origin --delete <branch>  # Git version 1.7.0 or newer
git push origin -d <branch>        # Shorter version (Git 1.7.0 or newer)
git push origin :<branch>          # Git versions older than 1.7.0

Deleting a local branch

git branch --delete <branch>
git branch -d <branch> # Shorter version
git branch -D <branch> # Force-delete un-merged branches

Prune remote branches

git remote prune origin

Update remote repo

git remote set-url origin http://xxxxx.git

Linux

useradd

sudo useradd <$name> -m -r -s /bin/bash -p <$password>

add as soduer

echo '<$name> ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers

telnet

a command line interface for communication with a remote device or serve

telnet <$ip> <$port>

for example

telnet 172.27.253.50 9000 #test application connectivity

lsof (list as open files)

everything is a file

lsof <$option:value>

for example

-a List processes that have open files

-c <process_name> List files opened by the specified process

-g List GID number process details

-d <file_number> List the processes occupying this file number

-d List open files in a directory

-D Recursively list open files in a directory

-n List files using NFS

-i List eligible processes. (protocol, :port, @ip)

-p List files opened by the specified process ID

-u List UID number process details

lsof -i:30443 # find port 30443 
lsof -i -P -n # list all connections

awk (Aho, Weinberger, and Kernighan [Names])

awk is a scripting language used for manipulating data and generating reports.

# awk [params] 'script' 
awk <$params> <$string_content>

for example

filter bigger than 3

echo -e "1\n2\n3\n4\n5\n" | awk '$1>3'

ss (socket statistics)

view detailed information about your system’s network connections, including TCP/IP, UDP, and Unix domain sockets

ss [options]

for example

Options	Description
-t	Display TCP sockets
-l	Display listening sockets
-n	Show numerical addresses instead of resolving
-a	Display all sockets (listening and non-listening)

#show all listening TCP connection
ss -tln

#show all established TCP connections
ss -tan

clean files 3 days ago

find /aaa/bbb/ccc/*.gz -mtime +3 -exec rm {} \;

ssh without affect $HOME/.ssh/known_hosts

ssh -o "UserKnownHostsFile /dev/null" root@aaa.domain.com
ssh -o "UserKnownHostsFile /dev/null" -o "StrictHostKeyChecking=no" root@aaa.domain.com

sync clock

[yum|dnf] install -y chrony \
    && systemctl enable chronyd \
    && (systemctl is-active chronyd || systemctl start chronyd) \
    && chronyc sources \
    && chronyc tracking \
    && timedatectl set-timezone 'Asia/Shanghai'

set hostname

hostnamectl set-hostname develop

add remote key to other server

ssh -o "UserKnownHostsFile /dev/null" \
    root@aaa.bbb.ccc \
    "mkdir -p /root/.ssh && chmod 700 /root/.ssh && echo '$SOME_PUBLIC_KEY' \
    >> /root/.ssh/authorized_keys && chmod 600 /root/.ssh/authorized_keys"

for example

ssh -o "UserKnownHostsFile /dev/null" \
    root@17.27.253.67 \
    "mkdir -p /root/.ssh && chmod 700 /root/.ssh && echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC00JLKF/Cd//rJcdIVGCX3ePo89KAgEccvJe4TEHs5pI5FSxs/7/JfQKZ+by2puC3IT88bo/d7nStw9PR3BXgqFXaBCknNBpSLWBIuvfBF+bcL+jGnQYo2kPjrO+2186C5zKGuPRi9sxLI5AkamGB39L5SGqwe5bbKq2x/8OjUP25AlTd99XsNjEY2uxNVClHysExVad/ZAcl0UVzG5xmllusXCsZVz9HlPExqB6K1sfMYWvLVgSCChx6nUfgg/NZrn/kQG26X0WdtXVM2aXpbAtBioML4rWidsByDb131NqYpJF7f+x3+I5pQ66Qpc72FW1G4mUiWWiGhF9tL8V9o1AY96Rqz0AVaxAQrBEuyCWKrXbA97HeC3Xp57Luvlv9TqUd8CIJYq+QTL0hlIDrzK9rJsg34FRAvf9sh8K2w/T/gC9UnRjRXgkPUgKldq35Y6Z9wP6KY45gCXka1PU4nVqb6wicO+RHcZ5E4sreUwqfTypt5nTOgW2/p8iFhdN8= Administrator@AARON-X1-8TH' \
    >> /root/.ssh/authorized_keys && chmod 600 /root/.ssh/authorized_keys"

set -x

This will print each command to the standard error before executing it, which is useful for debugging scripts.

set -x

set -e

Exit immediately if a command exits with a non-zero status.

set -x

sed (Stream Editor)

sed <$option> <$file_path>

for example

replace unix -> linux

echo "linux is great os. unix is opensource. unix is free os." | sed 's/unix/linux/'

or you can check https://www.geeksforgeeks.org/sed-command-in-linux-unix-with-examples/

fdisk

list all disk

fdisk -l

create CFS file system

Use mkfs.xfs command to create xfs file system and internal log on the same disk, example is shown below:

mkfs.xfs <$path>

modprobe

program to add and remove modules from the Linux Kernel

modprobe nfs && modprobe nfsd

disown

disown command in Linux is used to remove jobs from the job table.

disown [options] jobID1 jobID2 ... jobIDN

for example

for example, there is a job running in the background

ping google.com > /dev/null &

using jobs - to list all running jobs

jobs -l

using disown -a remove all jobs from the job tables

disown -a

using disown %2 to remove the #2 job

disown %2

generate SSH key

ssh-keygen -t rsa -b 4096 -C "aaron19940628@gmail.com"

create soft link

sudo ln -sf <$install_path>/bin/* /usr/local/bin

append dir into $PATH (temporary)

export PATH="/root/bin:$PATH"

copy public key to ECS

ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.200.60.53

Maven

1. build from submodule

You dont need to build from the head of project.

./mvnw clean package -DskipTests  -rf :<$submodule-name>

you can find the <$submodule-name> from submodule ’s pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
		xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

	<modelVersion>4.0.0</modelVersion>

	<parent>
		<groupId>org.apache.flink</groupId>
		<artifactId>flink-formats</artifactId>
		<version>1.20-SNAPSHOT</version>
	</parent>

	<artifactId>flink-avro</artifactId>
	<name>Flink : Formats : Avro</name>

Then you can modify the command as

./mvnw clean package -DskipTests  -rf :flink-avro

The result will look like this

[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING] 
[INFO] ------------------------------------------------------------------------
[INFO] Detecting the operating system and CPU architecture
[INFO] ------------------------------------------------------------------------
[INFO] os.detected.name: linux
[INFO] os.detected.arch: x86_64
[INFO] os.detected.bitness: 64
[INFO] os.detected.version: 6.7
[INFO] os.detected.version.major: 6
[INFO] os.detected.version.minor: 7
[INFO] os.detected.release: fedora
[INFO] os.detected.release.version: 38
[INFO] os.detected.release.like.fedora: true
[INFO] os.detected.classifier: linux-x86_64
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO] 
[INFO] Flink : Formats : Avro                                             [jar]
[INFO] Flink : Formats : SQL Avro                                         [jar]
[INFO] Flink : Formats : Parquet                                          [jar]
[INFO] Flink : Formats : SQL Parquet                                      [jar]
[INFO] Flink : Formats : Orc                                              [jar]
[INFO] Flink : Formats : SQL Orc                                          [jar]
[INFO] Flink : Python                                                     [jar]
...

Normally, build Flink will start from module flink-parent

2. skip some other test

For example, you can skip RAT test by doing this:

./mvnw clean package -DskipTests '-Drat.skip=true'

Gradle

1. spotless

keep your code spotless, check more detail in https://github.com/diffplug/spotless

see how to configuration

there are several files need to configure.

settings.gradle.kts

plugins {
    id("org.gradle.toolchains.foojay-resolver-convention") version "0.7.0"
}

build.gradle.kts

plugins {
    id("com.diffplug.spotless") version "6.23.3"
}
configure<com.diffplug.gradle.spotless.SpotlessExtension> {
    kotlinGradle {
        target("**/*.kts")
        ktlint()
    }
    java {
        target("**/*.java")
        googleJavaFormat()
            .reflowLongStrings()
            .skipJavadocFormatting()
            .reorderImports(false)
    }
    yaml {
        target("**/*.yaml")
        jackson()
            .feature("ORDER_MAP_ENTRIES_BY_KEYS", true)
    }
    json {
        target("**/*.json")
        targetExclude(".vscode/settings.json")
        jackson()
            .feature("ORDER_MAP_ENTRIES_BY_KEYS", true)
    }
}

And the, you can execute follwoing command to format your code.

./gradlew spotlessApply

./mvnw spotless:apply

2. shadowJar

shadowjar could combine a project’s dependency classes and resources into a single jar. check https://imperceptiblethoughts.com/shadow/

see how to configuration

you need moidfy your build.gradle.kts

import com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar

plugins {
    java // Optional 
    id("com.github.johnrengelman.shadow") version "8.1.1"
}

tasks.named<ShadowJar>("shadowJar") {
    archiveBaseName.set("connector-shadow")
    archiveVersion.set("1.0")
    archiveClassifier.set("")
    manifest {
        attributes(mapOf("Main-Class" to "com.example.xxxxx.Main"))
    }
}

./gradlew shadowJar

3. check dependency

list your project’s dependencies in tree view

see how to configuration

you need moidfy your build.gradle.kts

configurations {
    compileClasspath
}

./gradlew dependencies --configuration compileClasspath

./gradlew :<$module_name>:dependencies --configuration compileClasspath

Check Potential Result

result will look like this

compileClasspath - Compile classpath for source set 'main'.
+--- org.projectlombok:lombok:1.18.22
+--- org.apache.flink:flink-hadoop-fs:1.17.1
|    \--- org.apache.flink:flink-core:1.17.1
|         +--- org.apache.flink:flink-annotations:1.17.1
|         |    \--- com.google.code.findbugs:jsr305:1.3.9 -> 3.0.2
|         +--- org.apache.flink:flink-metrics-core:1.17.1
|         |    \--- org.apache.flink:flink-annotations:1.17.1 (*)
|         +--- org.apache.flink:flink-shaded-asm-9:9.3-16.1
|         +--- org.apache.flink:flink-shaded-jackson:2.13.4-16.1
|         +--- org.apache.commons:commons-lang3:3.12.0
|         +--- org.apache.commons:commons-text:1.10.0
|         |    \--- org.apache.commons:commons-lang3:3.12.0
|         +--- commons-collections:commons-collections:3.2.2
|         +--- org.apache.commons:commons-compress:1.21 -> 1.24.0
|         +--- org.apache.flink:flink-shaded-guava:30.1.1-jre-16.1
|         \--- com.google.code.findbugs:jsr305:1.3.9 -> 3.0.2
...

Elastic Search DSL

Basic Query

exist query

Returns documents that contain an indexed value for a field.

GET /_search
{
  "query": {
    "exists": {
      "field": "user"
    }
  }
}

The following search returns documents that are missing an indexed value for the user.id field.

GET /_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "user.id"
        }
      }
    }
  }
}

fuzz query

Returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.

GET /_search
{
  "query": {
    "fuzzy": {
      "filed_A": {
        "value": "ki"
      }
    }
  }
}

Returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.

GET /_search
{
  "query": {
    "fuzzy": {
      "filed_A": {
        "value": "ki",
        "fuzziness": "AUTO",
        "max_expansions": 50,
        "prefix_length": 0,
        "transpositions": true,
        "rewrite": "constant_score_blended"
      }
    }
  }
}

rewrite:

constant_score_boolean
constant_score_filter
top_terms_blended_freqs_N
top_terms_boost_N, top_terms_N
frequent_terms, score_delegating

ids query

Returns documents based on their IDs. This query uses document IDs stored in the _id field.

GET /_search
{
  "query": {
    "ids" : {
      "values" : ["2NTC5ZIBNLuBWC6V5_0Y"]
    }
  }
}

prefix query

The following search returns documents where the filed_A field contains a term that begins with ki.

GET /_search
{
  "query": {
    "prefix": {
      "filed_A": {
        "value": "ki",
         "rewrite": "constant_score_blended",
         "case_insensitive": true
      }
    }
  }
}

You can simplify the prefix query syntax by combining the <field> and value parameters.

GET /_search
{
  "query": {
    "prefix" : { "filed_A" : "ki" }
  }
}

range query

Returns documents that contain terms within a provided range.

GET /_search
{
  "query": {
    "range": {
      "filed_number": {
        "gte": 10,
        "lte": 20,
        "boost": 2.0
      }
    }
  }
}

GET /_search
{
  "query": {
    "range": {
      "filed_timestamp": {
        "time_zone": "+01:00",        
        "gte": "2020-01-01T00:00:00", 
        "lte": "now"                  
      }
    }
  }
}

regex query

Returns documents that contain terms matching a regular expression.

GET /_search
{
  "query": {
    "regexp": {
      "filed_A": {
        "value": "k.*y",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 10000,
        "rewrite": "constant_score_blended"
      }
    }
  }
}

term query

Returns documents that contain an exact term in a provided field.

You can use the term query to find documents based on a precise value such as a price, a product ID, or a username.

GET /_search
{
  "query": {
    "term": {
      "filed_A": {
        "value": "kimchy",
        "boost": 1.0
      }
    }
  }
}

wildcard query

Returns documents that contain terms matching a wildcard pattern.

A wildcard operator is a placeholder that matches one or more characters. For example, the * wildcard operator matches zero or more characters. You can combine wildcard operators with other characters to create a wildcard pattern.

GET /_search
{
  "query": {
    "wildcard": {
      "filed_A": {
        "value": "ki*y",
        "boost": 1.0,
        "rewrite": "constant_score_blended"
      }
    }
  }
}

Command

Subsections of Command

Git CMD

Init global config

change user and email (locally)

list all remote repo

Get specific file from remote

Clone specific branch

Update submodule

Save credential

Delete Branch

Prune remote branches

Update remote repo

Linux

useradd

telnet

lsof (list as open files)

awk (Aho, Weinberger, and Kernighan [Names])

ss (socket statistics)

clean files 3 days ago

ssh without affect $HOME/.ssh/known_hosts

sync clock

set hostname

add remote key to other server

set -x

set -e

sed (Stream Editor)

fdisk

create CFS file system

modprobe

disown

generate SSH key

create soft link

append dir into $PATH (temporary)

copy public key to ECS

Maven

1. build from submodule

2. skip some other test

Gradle

1. spotless

2. shadowJar

3. check dependency

Elastic Search DSL

Basic Query